Arrays, slices (and strings): The mechanics of 'append'
26 Sep 2013
Tags: array, slice, string, copy, append

Rob Pike

* Introduction

One of the most common features of procedural programming languages is
the concept of an array.
Arrays seem like simple things but there are many questions that must be
answered when adding them to a language, such as:

- fixed-size or variable-size?
- is the size part of the type?
- what do multidimensional arrays look like?
- does the empty array have meaning?

The answers to these questions affect whether arrays are just
a feature of the language or a core part of its design.

In the early development of Go, it took about a year to decide the answers
to these questions before the design felt right.
The key step was the introduction of _slices_, which built on fixed-size
_arrays_ to give a flexible, extensible data structure.
To this day, however, programmers new to Go often stumble over the way slices
work, perhaps because experience from other languages has colored their thinking.

In this post we'll attempt to clear up the confusion.
We'll do so by building up the pieces to explain how the `append` built-in function
works, and why it works the way it does.

* Arrays

Arrays are an important building block in Go, but like the foundation of a building
they are often hidden below more visible components.
We must talk about them briefly before we move on to the more interesting,
powerful, and prominent idea of slices.

Arrays are not often seen in Go programs because
the size of an array is part of its type, which limits its expressive power.

The declaration

.code slices/prog010.go /var buffer/

declares the variable `buffer`, which holds 256 bytes.
The type of `buffer` includes its size, `[256]byte`.
An array with 512 bytes would be of the distinct type `[512]byte`.

The data associated with an array is just that: an array of elements.
Schematically, our buffer looks like this in memory,

	buffer: byte byte byte ... 256 times ... byte byte byte

That is, the variable holds 256 bytes of data and nothing else. We can
access its elements with the familiar indexing syntax, `buffer[0]`, `buffer[1]`,
and so on through `buffer[255]`. (The index range 0 through 255 covers
256 elements.) Attempting to index `buffer` with a value outside this
range will crash the program.

There is a built-in function called `len` that returns the number of elements
of an array or slice and also of a few other data types.
For arrays, it's obvious what `len` returns.
In our example, `len(buffer)` returns the fixed value 256.

Arrays have their place (they are a good representation of a transformation
matrix, for instance) but their most common purpose in Go is to hold storage
for a slice.

* Slices: The slice header

Slices are where the action is, but to use them well one must understand
exactly what they are and what they do.

A slice is a data structure describing a contiguous section of an array
stored separately from the slice variable itself.
_A_slice_is_not_an_array_.
A slice _describes_ a piece of an array.

Given our `buffer` array variable from the previous section, we could create
a slice that describes elements 100 through 150 (to be precise, 100 through 149,
inclusive) by _slicing_ the array:

.code slices/prog010.go /var slice/

In that snippet we used the full variable declaration to be explicit.
The variable `slice` has type `[]byte`, pronounced "slice of bytes",
and is initialized from the array, called
`buffer`, by slicing elements 100 (inclusive) through 150 (exclusive).
The more idiomatic syntax would drop the type, which is set by the initializing expression:

	var slice = buffer[100:150]

Inside a function we could use the short declaration form,

	slice := buffer[100:150]

What exactly is this slice variable?
It's not quite the full story, but for now think of a
slice as a little data structure with two elements: a length and a pointer to an element
of an array.
You can think of it as being built like this behind the scenes:

	type sliceHeader struct {
	    Length        int
	    ZerothElement *byte
	}

	slice := sliceHeader{
	    Length:        50,
	    ZerothElement: &buffer[100],
	}

Of course, this is just an illustration.
Despite what this snippet says, that `sliceHeader` struct is not visible
to the programmer, and the type
of the element pointer depends on the type of the elements,
but this gives the general idea of the mechanics.

So far we've used a slice operation on an array, but we can also slice a slice, like this:

	slice2 := slice[5:10]

Just as before, this operation creates a new slice, in this case with elements
5 through 9 (inclusive) of the original slice, which means elements
105 through 109 of the original array.
The underlying `sliceHeader` struct for the `slice2` variable looks like
this:

	slice2 := sliceHeader{
	    Length:        5,
	    ZerothElement: &buffer[105],
	}

Notice that this header still points to the same underlying array, stored in
the `buffer` variable.

We can also _reslice_, which is to say slice a slice and store the result back in
the original slice structure. After

	slice = slice[5:10]

the `sliceHeader` structure for the `slice` variable looks just like it did for the `slice2`
variable.
You'll see reslicing used often, for example to truncate a slice.
This statement drops
the first and last elements of our slice:

	slice = slice[1:len(slice)-1]

[Exercise: Write out what the `sliceHeader` struct looks like after this assignment.]

You'll often hear experienced Go programmers talk about the "slice header"
because that really is what's stored in a slice variable.
For instance, when you call a function that takes a slice as an argument, such as
[[http://golang.org/pkg/bytes/#IndexRune][bytes.IndexRune]], that header is
what gets passed to the function.
In this call,

	slashPos := bytes.IndexRune(slice, '/')

the `slice` argument that is passed to the `IndexRune` function is, in fact,
a "slice header".

There's one more data item in the slice header, which we talk about below,
but first let's see what the existence of the slice header means when you
program with slices.

* Passing slices to functions

It's important to understand that even though a slice contains a pointer,
it is itself a value.
Under the covers, it is a struct value holding a pointer and a length.
It is _not_ a pointer to a struct.

This matters.

When we called `IndexRune` in the previous example,
it was passed a _copy_ of the slice header.
That behavior has important ramifications.

Consider this simple function:

.code slices/prog010.go /^func/,/^}/

It does just what its name implies, iterating over the indices of a slice
(using a `for` `range` loop), incrementing its elements.

Try it:

.play -edit slices/prog010.go /^func main/,/^}/

(You can edit and re-execute these runnable snippets if you want to explore.)

Even though the slice _header_ is passed by value, the header includes
a pointer to elements of an array, so both the original slice header
and the copy of the header passed to the function describe the same
array.
Therefore, when the function returns, the modified elements can
be seen through the original slice variable.

The argument to the function really is a copy, as this example shows:

.play -edit slices/prog020.go /^func/,$

Here we see that the _contents_ of a slice argument can be modified by a function,
but its _header_ cannot.
The length stored in the `slice` variable is not modified by the call to the function,
since the function is passed a copy of the slice header, not the original.
Thus if we want to write a function that modifies the header, we must return it as a result
parameter, just as we have done here.
The `slice` variable is unchanged but the returned value has the new length,
which is then stored in `newSlice`.

* Pointers to slices: Method receivers

Another way to have a function modify the slice header is to pass a pointer to it.
Here's a variant of our previous example that does this:

.play -edit slices/prog030.go /^func/,$

It seems clumsy in that example, especially dealing with the extra level of indirection
(a temporary variable helps),
but there is one common case where you see pointers to slices.
It is idiomatic to use a pointer receiver for a method that modifies a slice.

Let's say we wanted to have a method on a slice that truncates it at the final slash.
We could write it like this:

.play -edit slices/prog040.go /^type/,$

If you run this example you'll see that it works properly, updating the slice in the caller.

[Exercise: Change the type of the receiver to be a value rather
than a pointer and run it again. Explain what happens.]

On the other hand, if we wanted to write a method for `path` that upper-cases
the ASCII letters in the path (parochially ignoring non-English names), the method's receiver could
be a value because the value receiver will still point to the same underlying array.
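
As a rough sketch of the two receiver styles discussed in this section, assume `path` is a named `[]byte` type. The method bodies below are illustrative and are not necessarily identical to the programs the directives in this section refer to. The pointer receiver is needed only where the header (here, the length) changes; the value receiver is enough where only the elements change.

```go
package main

import (
	"bytes"
	"fmt"
)

// path is assumed, for this sketch, to be a named byte-slice type.
type path []byte

// TruncateAtFinalSlash shortens the slice, which changes the header,
// so it takes a pointer receiver to update the caller's variable.
func (p *path) TruncateAtFinalSlash() {
	i := bytes.LastIndexByte(*p, '/')
	if i >= 0 {
		*p = (*p)[:i]
	}
}

// ToUpper changes only the elements. The copied header still points at
// the same array, so a value receiver is sufficient.
func (p path) ToUpper() {
	for i, b := range p {
		if 'a' <= b && b <= 'z' {
			p[i] = b + 'A' - 'a'
		}
	}
}

func main() {
	p := path("/usr/bin/tso")
	p.ToUpper()
	fmt.Printf("%s\n", p) // /USR/BIN/TSO
	p.TruncateAtFinalSlash()
	fmt.Printf("%s\n", p) // /USR/BIN
}
```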

.play -edit slices/prog050.go /^type/,$

Here the `ToUpper` method uses two variables in the `for` `range` construct
to capture the index and slice element.
This form of loop avoids writing `p[i]` multiple times in the body.

[Exercise: Convert the `ToUpper` method to use a pointer receiver and see if its behavior changes.]

[Advanced exercise: Convert the `ToUpper` method to handle Unicode letters, not just ASCII.]

* Capacity

Look at the following function that extends its argument slice of `ints` by one element:

.code slices/prog060.go /^func Extend/,/^}/

(Why does it need to return the modified slice?) Now run it:

.play -edit slices/prog060.go /^func main/,/^}/

See how the slice grows until... it doesn't.

It's time to talk about the third component of the slice header: its _capacity_.
Besides the array pointer and length, the slice header also stores its capacity:

	type sliceHeader struct {
	    Length        int
	    Capacity      int
	    ZerothElement *byte
	}

The `Capacity` field records how much space the underlying array actually has; it is the maximum
value the `Length` can reach.
Trying to grow the slice beyond its capacity will step beyond the limits of the array and will trigger a panic.

After our example slice is created by

	slice := iBuffer[0:0]

its header looks like this:

	slice := sliceHeader{
	    Length:        0,
	    Capacity:      10,
	    ZerothElement: &iBuffer[0],
	}

The `Capacity` field is equal to the length of the underlying array,
minus the index in the array of the first element of the slice (zero in this case).
If you want to inquire what the capacity is for a slice, use the built-in function `cap`:

	if cap(slice) == len(slice) {
	    fmt.Println("slice is full!")
	}

* Make

What if we want to grow the slice beyond its capacity?
You can't!
By definition, the capacity is the limit to growth.
But you can achieve an equivalent result by allocating a new array, copying the data over, and modifying
the slice to describe the new array.

Let's start with allocation.
We could use the `new` built-in function to allocate a bigger array
and then slice the result,
but it is simpler to use the `make` built-in function instead.
It allocates a new array and
creates a slice header to describe it, all at once.
The `make` function takes three arguments: the type of the slice, its initial length, and its capacity, which is the
length of the array that `make` allocates to hold the slice data.
This call creates a slice of length 10 with room for 5 more (15-10), as you can see by running it:

.play -edit slices/prog070.go /slice/,/fmt/

This snippet doubles the capacity of our `int` slice but keeps its length the same:

.play -edit slices/prog080.go /slice/,/OMIT/

After running this code the slice has much more room to grow before needing another reallocation.

When creating slices, it's often true that the length and capacity will be the same.
The `make` built-in has a shorthand for this common case.
The capacity argument defaults to the length, so you can leave it out
to set them both to the same value.
After

	gophers := make([]Gopher, 10)

the `gophers` slice has both its length and capacity set to 10.

* Copy

When we doubled the capacity of our slice in the previous section,
we wrote a loop to copy the old data to the new slice.
Go has a built-in function, `copy`, to make this easier.
Its arguments are two slices, and it copies the data from the right-hand argument to the left-hand argument.
Here's our example rewritten to use `copy`:

.play -edit slices/prog090.go /newSlice/,/newSlice/

The `copy` function is smart.
It only copies what it can, paying attention to the lengths of both arguments.
In other words, the number of elements it copies is the minimum of the lengths of the two slices.
This can save a little bookkeeping.
Also, `copy` returns an integer value, the number of elements it copied, although it's not always worth checking.

The `copy` function also gets things right when source and destination overlap, which means it can be used to shift
items around in a single slice.
Here's how to use `copy` to insert a value into the middle of a slice.

.code slices/prog100.go /Insert/,/^}/

There are a couple of things to notice in this function.
First, of course, it must return the updated slice because its length has changed.
Second, it uses a convenient shorthand.
The expression

	slice[i:]

means exactly the same as

	slice[i:len(slice)]

Also, although we haven't used the trick yet, we can leave out the first element of a slice expression too;
it defaults to zero. Thus

	slice[:]

just means the slice itself, which is useful when slicing an array.
This expression is the shortest way to say "a slice describing all the elements of the array":

	array[:]

Now that's out of the way, let's run our `Insert` function.

.play -edit slices/prog100.go /make/,/OMIT/

* Append: An example

A few sections back, we wrote an `Extend` function that extends a slice by one element.
It was buggy, though, because if the slice's capacity was too small, the function would
crash.
(Our `Insert` example has the same problem.)
Now we have the pieces in place to fix that, so let's write a robust implementation of
`Extend` for integer slices.
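
One way such a function might look, combining `make`, `copy`, and reslicing as described in the previous sections. This is a sketch: the growth policy (double the length and add one, so that a zero-capacity slice can still grow) is an assumption made here for illustration.

```go
package main

import "fmt"

// Extend adds element to the end of slice, allocating a larger
// array first if the slice is already at capacity.
func Extend(slice []int, element int) []int {
	n := len(slice)
	if n == cap(slice) {
		// Slice is full; allocate a bigger array and copy the data over.
		// Growth policy assumed for this sketch: 2*len + 1.
		newSlice := make([]int, n, 2*n+1)
		copy(newSlice, slice)
		slice = newSlice
	}
	slice = slice[0 : n+1] // there is now room to grow by one
	slice[n] = element
	return slice
}

func main() {
	slice := make([]int, 0, 2)
	for i := 0; i < 5; i++ {
		slice = Extend(slice, i)
		fmt.Printf("len=%d cap=%d slice=%v\n", len(slice), cap(slice), slice)
	}
}
```

The caller must keep the returned slice, because after a reallocation it describes a different array from the one the argument described.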

.code slices/prog110.go /func Extend/,/^}/

In this case it's especially important to return the slice, since when it reallocates
the resulting slice describes a completely different array.
Here's a little snippet to demonstrate what happens as the slice fills up:

.play -edit slices/prog110.go /START/,/END/

Notice the reallocation when the initial array of size 5 is filled up.
Both the capacity and the address of the zeroth element change when the new array is allocated.

With the robust `Extend` function as a guide we can write an even nicer function that lets
us extend the slice by multiple elements.
To do this, we use Go's ability to turn a list of function arguments into a slice when the
function is called.
That is, we use Go's variadic function facility.

Let's call the function `Append`.
For the first version, we can just call `Extend` repeatedly so the mechanism of the variadic function is clear.
The signature of `Append` is this:

	func Append(slice []int, items ...int) []int

What that says is that `Append` takes one argument, a slice, followed by zero or more
`int` arguments.
Those arguments are exactly a slice of `int` as far as the implementation
of `Append` is concerned, as you can see:

.code slices/prog120.go /Append/,/^}/

Notice the `for` `range` loop iterating over the elements of the `items` argument, which has implied type `[]int`.
Also notice the use of the blank identifier `_` to discard the index in the loop, which we don't need in this case.

Try it:

.play -edit slices/prog120.go /START/,/END/

Another new technique in this example is that we initialize the slice by writing a composite literal,
which consists of the type of the slice followed by its elements in braces:

.code slices/prog120.go /slice := /

The `Append` function is interesting for another reason.
Not only can we append elements, we can append a whole second slice
by "exploding" the slice into arguments using the `...` notation at the call site:

.play -edit slices/prog130.go /START/,/END/

Of course, we can make `Append` more efficient by allocating no more than once,
building on the innards of `Extend`:

.code slices/prog140.go /Append/,/^}/

Here, notice how we use `copy` twice, once to move the slice data to the newly
allocated memory, and then to copy the appended items to the end of the old data.

Try it; the behavior is the same as before:

.play -edit slices/prog140.go /START/,/END/

* Append: The built-in function

And so we arrive at the motivation for the design of the `append` built-in function.
It does exactly what our `Append` example does, with equivalent efficiency, but it
works for any slice type.

A weakness of Go is that any generic-type operations must be provided by the
run-time. Some day that may change, but for now, to make working with slices
easier, Go provides a built-in generic `append` function.
It works the same as our `int` slice version, but for _any_ slice type.

Remember, since the slice header is always updated by a call to `append`, you need
to save the returned slice after the call.
In fact, the compiler won't let you call append without saving the result.

Here are some one-liners intermingled with print statements. Try them, edit them and explore:

.play -edit slices/prog150.go /START/,/END/

It's worth taking a moment to think about the final one-liner of that example in detail to understand
how the design of slices makes it possible for this simple call to work correctly.

There are lots more examples of `append`, `copy`, and other ways to use slices
on the community-built
[[https://golang.org/wiki/SliceTricks]["Slice Tricks" Wiki page]].
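
For readers without the runnable snippets at hand, here is a small sketch of common uses of the built-in `append`. The `clone` helper is a hypothetical name introduced only for this illustration; everything else uses the built-in directly.

```go
package main

import "fmt"

// clone is a hypothetical helper: it copies a slice by appending
// its elements to a nil slice, which forces a fresh allocation.
func clone(s []int) []int {
	return append([]int(nil), s...)
}

func main() {
	// Append a single item, saving the returned slice.
	slice := []int{0, 1, 2, 3}
	slice = append(slice, 4)
	fmt.Println(slice) // [0 1 2 3 4]

	// Append all the elements of a second slice using ... at the call site.
	slice2 := []int{55, 66, 77}
	slice = append(slice, slice2...)
	fmt.Println(slice) // [0 1 2 3 4 55 66 77]

	// The copy has its own array, so writes to it do not show through.
	dup := clone(slice)
	dup[0] = 100
	fmt.Println(slice[0], dup[0]) // 0 100

	// Append a slice to itself.
	slice = append(slice, slice...)
	fmt.Println(len(slice)) // 16
}
```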

* Nil

As an aside, with our newfound knowledge we can see what the representation of a `nil` slice is.
Naturally, it is the zero value of the slice header:

	sliceHeader{
	    Length:        0,
	    Capacity:      0,
	    ZerothElement: nil,
	}

or just

	sliceHeader{}

The key detail is that the element pointer is `nil` too. The slice created by

	array[0:0]

has length zero (and maybe even capacity zero) but its pointer is not `nil`, so
it is not a nil slice.

As should be clear, an empty slice can grow (assuming it has non-zero capacity), but a `nil`
slice has no array to put values in and can never grow to hold even one element.

That said, a `nil` slice is functionally equivalent to a zero-length slice, even though it points
to nothing.
It has length zero and can be appended to, with allocation.
As an example, look at the one-liner above that copies a slice by appending
to a `nil` slice.

* Strings

Now a brief section about strings in Go in the context of slices.

Strings are actually very simple: they are just read-only slices of bytes with a bit
of extra syntactic support from the language.

Because they are read-only, there is no need for a capacity (you can't grow them),
but otherwise for most purposes you can treat them just like read-only slices
of bytes.

For starters, we can index them to access individual bytes:

	slash := "/usr/ken"[0] // yields the byte value '/'.

We can slice a string to grab a substring:

	usr := "/usr/ken"[0:4] // yields the string "/usr"

It should be obvious now what's going on behind the scenes when we slice a string.
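
To make the indexing and slicing concrete, here is a small sketch. The `dir` function is a hypothetical helper written for this illustration: it indexes the string byte by byte to find the final slash, then slices the string to return what precedes it.

```go
package main

import "fmt"

// dir is a hypothetical helper that returns the part of s before the
// final slash, or the empty string if s contains no slash.
func dir(s string) string {
	for i := len(s) - 1; i >= 0; i-- {
		if s[i] == '/' { // indexing a string yields a byte
			return s[:i] // slicing a string yields a string
		}
	}
	return ""
}

func main() {
	fmt.Println(dir("/usr/ken")) // /usr
	// Unlike a byte slice, a string cannot be written through:
	// an assignment such as s[0] = 'x' does not compile.
}
```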

We can also take a normal slice of bytes and create a string from it with the simple conversion:

	str := string(slice)

and go in the reverse direction as well:

	slice := []byte(usr)

The array underlying a string is hidden from view; there is no way to access its contents
except through the string. That means that when we do either of these conversions, a
copy of the array must be made.
Go takes care of this, of course, so you don't have to.
After either of these conversions, modifications to
the array underlying the byte slice don't affect the corresponding string.

An important consequence of this slice-like design for strings is that
creating a substring is very efficient.
All that needs to happen
is the creation of a two-word string header. Since the string is read-only, the original
string and the string resulting from the slice operation can share the same array safely.

A historical note: The earliest implementation of strings always allocated, but when slices
were added to the language, they provided a model for efficient string handling. Some of
the benchmarks saw huge speedups as a result.

There's much more to strings, of course, but they are a topic for another post.

* Conclusion

To understand how slices work, it helps to understand how they are implemented.
There is a little data structure, the slice header, that is the item associated with the slice
variable, and that header describes a section of a separately allocated array.
When we pass slice values around, the header gets copied but the array it points
to is always shared.

Once you appreciate how they work, slices become not only easy to use, but
powerful and expressive, especially with the help of the `copy` and `append`
built-in functions.

* More reading

There's lots to find around the intertubes about slices in Go.
As mentioned earlier,
the [[https://golang.org/wiki/SliceTricks]["Slice Tricks" Wiki page]]
has many examples.
The [[http://blog.golang.org/go-slices-usage-and-internals][Go Slices]] blog post
describes the memory layout details with clear diagrams.
Russ Cox's [[http://research.swtch.com/godata][Go Data Structures]] article includes
a discussion of slices along with some of Go's other internal data structures.

There is much more material available, but the best way to learn about slices is to use them.