
Tips for working with slice capacity and length in Go

PHPz
2024-03-20 14:36:28


Quick quiz: what does the following code output?

vals := make([]int, 5)
for i := 0; i < 5; i++ {
  vals = append(vals, i)
}
fmt.Println(vals)

If you guessed [0 0 0 0 0 0 1 2 3 4], you are right.

If you got the quiz wrong, don't worry. This is a fairly common mistake when transitioning to Go, and in this article we'll explain why the output isn't what you expected, and how to take advantage of Go's nuances to make your code more efficient.

Slice vs Array

There are both arrays and slices in Go. It can be confusing, but once you get used to it, you'll love it. Please trust me.

There are many differences between slices and arrays, but the one we are going to focus on in this article is that the size of an array is part of its type, whereas slices can have dynamic sizes because they are a wrapper around an array.

What does this mean in practice? Let's say we have the array var a [10]int. The array has a fixed size that cannot be changed. If we call len(a), it always returns 10 because the size is part of the type. So if you suddenly need more than 10 items in the array, you have to create a new object of a completely different type, say var b [11]int, and then copy all the values from a to b.
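As a minimal sketch of that copy step (growByOne is a hypothetical helper name, not something from the original article), the following shows that [10]int and [11]int are separate types and that moving values between them takes an explicit copy:

```go
package main

import "fmt"

// growByOne copies a 10-element array into a new 11-element array.
// [10]int and [11]int are distinct types, so assigning one to the
// other directly would not compile. (Hypothetical helper.)
func growByOne(a [10]int) [11]int {
	var b [11]int
	copy(b[:], a[:]) // copy operates on slices, so slice both arrays
	return b
}

func main() {
	var a [10]int
	for i := range a {
		a[i] = i
	}

	b := growByOne(a)
	b[10] = 10 // the extra slot only exists in the new array

	fmt.Println(len(a), len(b)) // 10 11
}
```

Every time the collection needs to grow, this pattern needs a new type and a full copy, which is the inconvenience slices are meant to remove.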

There are specific situations where fixed-size arrays are valuable, but generally speaking, this is not what developers want. Instead, they want something similar to an array that can grow over time. A crude way to get this is to create an array much larger than it needs to be, and then treat a subset of that array as your array. The code below is an example.

var vals [20]int
for i := 0; i < 5; i++ {
  vals[i] = i * i
}
subsetLen := 5

fmt.Println("The subset of our array has a length of:", subsetLen)

// Add a new item to our array
vals[subsetLen] = 123
subsetLen++
fmt.Println("The subset of our array has a length of:", subsetLen)

In this code we have an array of length 20, but since we only use a subset of it, we can treat the array as having a length of 5, and then 6 after we add a new item.

This is (very roughly) how slices work. They wrap an array with a set size, like the array in our previous example, which had a size of 20.

They also keep track of the subset of the array that the program is using. This attribute is called the length of the slice, and it is similar to the subsetLen variable in the previous example.

Finally, a slice also has a capacity, which is similar to the total length of our array in the previous example (20). This is useful because it tells you how large your subset can grow before it no longer fits in the array backing the slice. When this happens, a new array needs to be allocated, but all of this logic is hidden behind the append function.
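To make the relationship between a slice and its backing array concrete, here is a small sketch that slices a 20-element array. The helper firstN is a hypothetical name introduced only for this illustration:

```go
package main

import "fmt"

// firstN returns a slice that views the first n elements of arr.
// The slice shares memory with arr; nothing is copied.
// (Hypothetical helper for this sketch.)
func firstN(arr *[20]int, n int) []int {
	return arr[:n]
}

func main() {
	var backing [20]int
	vals := firstN(&backing, 5)
	fmt.Println(len(vals), cap(vals)) // 5 20

	// The new element fits within the capacity, so append reuses
	// the existing array instead of allocating a new one.
	vals = append(vals, 123)
	fmt.Println(len(vals), cap(vals)) // 6 20
	fmt.Println(backing[5])           // 123
}
```

The last line shows that append wrote straight into the original array, which is the same thing the subsetLen example did by hand.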

In short, combining slices with the append function gives us a type that is very similar to an array, but that can grow over time to handle more elements.

Let’s look at the previous example again, but this time we’ll use slices instead of arrays.

var vals []int
for i := 0; i < 5; i++ {
  vals = append(vals, i)
  fmt.Println("The length of our slice is:", len(vals))
  fmt.Println("The capacity of our slice is:", cap(vals))
}

// Add a new item to our slice
vals = append(vals, 123)
fmt.Println("The length of our slice is:", len(vals))
fmt.Println("The capacity of our slice is:", cap(vals))

// Accessing items is the same as an array
fmt.Println(vals[5])
fmt.Println(vals[2])

We can still access the elements in our slice just like an array, but by using slices and the append function, we no longer need to consider the size of the array behind it. We can still figure this stuff out by using the len and cap functions, but we don't have to worry about it as much. Simple, right?

Back to the quiz

With this in mind, let's review the quiz from earlier and see what went wrong.

vals := make([]int, 5)
for i := 0; i < 5; i++ {
  vals = append(vals, i)
}
fmt.Println(vals)

When calling make, we can pass up to three arguments. The first is the type we are allocating, the second is the "length" of the slice, and the third is its "capacity" (this argument is optional).

By calling make([]int, 5), we tell the program that we want a slice of length 5. When the capacity is omitted it defaults to the length, so it is also 5 in this case.
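The following sketch prints the length, capacity, and contents produced by a few make calls. The describe helper is a hypothetical name used only for this illustration:

```go
package main

import "fmt"

// describe formats the length, capacity, and contents of s.
// (Hypothetical helper for this sketch.)
func describe(s []int) string {
	return fmt.Sprintf("len=%d cap=%d %v", len(s), cap(s), s)
}

func main() {
	// Two arguments: the capacity defaults to the length,
	// and every element starts at the zero value.
	fmt.Println(describe(make([]int, 5))) // len=5 cap=5 [0 0 0 0 0]

	// Three arguments: length and capacity are set separately.
	fmt.Println(describe(make([]int, 2, 5))) // len=2 cap=5 [0 0]
	fmt.Println(describe(make([]int, 0, 5))) // len=0 cap=5 []
}
```

Note that a slice created with a non-zero length already contains that many zero-valued elements, and append always adds after them.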

While this may look like what we want, the important distinction is that we told our slice to set both the "length" and the "capacity" to 5. We then call the append function, which assumes we want to add new elements after the initial 5, so it increases the capacity and adds the new elements at the end of the slice.

If you add a Println() statement to the code, you can see the capacity change.

vals := make([]int, 5)
fmt.Println("Capacity was:", cap(vals))
for i := 0; i < 5; i++ {
  vals = append(vals, i)
  fmt.Println("Capacity is now:", cap(vals))
}

fmt.Println(vals)

In the end, we end up with the output of [0 0 0 0 0 0 1 2 3 4] instead of the desired [0 1 2 3 4].

How do we fix it? There are a few ways to do this. We're going to cover two, and you can pick whichever is most useful in your scenario.

Write directly to indexes instead of using append

The first fix is to leave the make call unchanged and explicitly set each element by its index. That gives us the following code:

vals := make([]int, 5)
for i := 0; i < 5; i++ {
  vals[i] = i
}
fmt.Println(vals)

In this case, the value we set happens to be the same as the index we write to, but you can also track the index independently.

For example, if you want to get the keys of a map, you can use the following code.

package main

import "fmt"

func main() {
  fmt.Println(keys(map[string]struct{}{
    "dog": struct{}{},
    "cat": struct{}{},
  }))
}

func keys(m map[string]struct{}) []string {
  ret := make([]string, len(m))
  i := 0
  for key := range m {
    ret[i] = key
    i++
  }
  return ret
}

This works well because we know that the length of the slice we return will be the same as the length of the map, so we can initialize our slice with that length and then assign each element to the appropriate index. The disadvantage of this approach is that we have to keep track of i in order to know which index each value should be written to.

This brings us to the second method...

Use 0 as your length and specify the capacity

Instead of keeping track of the index of the value we want to add, we can update our make call and provide two arguments after the slice type. First, the length of our new slice is set to 0, because we haven't added any elements to it yet. Second, the capacity of our new slice is set to the length of the map parameter, since we know that many strings will eventually be added to it.

This still builds the same array behind the scenes as the previous example, but now when we call append, the new elements are placed at the beginning of the slice, because the length of the slice is 0.

package main

import "fmt"

func main() {
  fmt.Println(keys(map[string]struct{}{
    "dog": struct{}{},
    "cat": struct{}{},
  }))
}

func keys(m map[string]struct{}) []string {
  ret := make([]string, 0, len(m))
  for key := range m {
    ret = append(ret, key)
  }
  return ret
}

If append handles it, why do we have to worry about capacity?

Next you may ask: "If the append function can increase the capacity of the slice for me, then why do we need to tell the program the capacity?"

The truth is, in most cases, you don't have to worry about this too much. If it makes your code more complex, just initialize your slice with var vals []int and let the append function handle the rest.

But this case is different. It isn't a situation where declaring the capacity is difficult. In fact, it's easy to determine the final capacity of our slice, since we know it corresponds directly to the length of the provided map. Therefore, we can declare the capacity of the slice when we initialize it and save our program from performing unnecessary memory allocations.

If you want to see the additional memory allocations, run the following code on the Go Playground. Every time the capacity increases, the program has to allocate a new array.

package main

import "fmt"

func main() {
  fmt.Println(keys(map[string]struct{}{
    "dog": struct{}{},
    "cat": struct{}{},
    "mouse": struct{}{},
    "wolf": struct{}{},
    "alligator": struct{}{},
  }))
}

func keys(m map[string]struct{}) []string {
  var ret []string
  fmt.Println(cap(ret))
  for key := range m {
    ret = append(ret, key)
    fmt.Println(cap(ret))
  }
  return ret
}

Now compare this to the same code but with a predefined capacity.

package main

import "fmt"

func main() {
  fmt.Println(keys(map[string]struct{}{
    "dog": struct{}{},
    "cat": struct{}{},
    "mouse": struct{}{},
    "wolf": struct{}{},
    "alligator": struct{}{},
  }))
}

func keys(m map[string]struct{}) []string {
  ret := make([]string, 0, len(m))
  fmt.Println(cap(ret))
  for key := range m {
    ret = append(ret, key)
    fmt.Println(cap(ret))
  }
  return ret
}

In the first code example, our capacity started at 0 and then increased to 1, 2, 4, and finally 8. Each of those increases means a new array had to be allocated (four times here), and the final array backing our slice has a capacity of 8, which is larger than we ultimately need.

On the other hand, our second example starts and ends with the same capacity (5), so the array only needs to be allocated once, at the beginning of the keys() function. We also avoid wasting any extra memory and return a slice whose backing array is exactly the size we need.
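The comparison can also be expressed as a count rather than read off printed capacities. The sketch below uses a hypothetical helper, countGrowths, that counts how often the capacity changes while appending. The exact growth pattern without preallocation depends on the Go implementation and version, so no specific number is assumed for that case:

```go
package main

import "fmt"

// countGrowths appends n strings to a slice and returns how many times
// the capacity changed along the way. Each change means append had to
// allocate a new backing array. (Hypothetical helper for this sketch.)
func countGrowths(n int, preallocate bool) int {
	var ret []string
	if preallocate {
		ret = make([]string, 0, n)
	}

	growths := 0
	prev := cap(ret)
	for i := 0; i < n; i++ {
		ret = append(ret, "x")
		if cap(ret) != prev {
			growths++
			prev = cap(ret)
		}
	}
	return growths
}

func main() {
	// Implementation dependent, but at least one growth is required.
	fmt.Println("without preallocation:", countGrowths(5, false))

	// The capacity is already sufficient, so append never reallocates.
	fmt.Println("with preallocation:", countGrowths(5, true)) // with preallocation: 0
}
```

The preallocated case still performs one allocation inside make; what it avoids is the repeated allocate-and-copy work during the loop.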

Don’t over-optimize

As mentioned before, I usually discourage people from worrying about small optimizations like this, but in cases where the final size is really obvious, I strongly recommend setting an appropriate capacity or length for the slice.

Not only does this help improve the performance of your program, it also helps clarify your code by explicitly stating the relationship between the size of the input and the size of the output.
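As one illustration of that point, here is a sketch of a function whose output is always the same size as its input. The function name squares is a hypothetical example, not part of the original article:

```go
package main

import "fmt"

// squares returns a new slice holding the square of every input value.
// make([]int, len(in)) states up front that the output has exactly
// one element per input element. (Hypothetical example function.)
func squares(in []int) []int {
	out := make([]int, len(in))
	for i, v := range in {
		out[i] = v * v
	}
	return out
}

func main() {
	fmt.Println(squares([]int{1, 2, 3})) // [1 4 9]
}
```

Here the index comes for free from range, so writing by index is arguably the simpler choice; when no natural index exists, a zero length with an explicit capacity plus append reads better.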

Summary

This article is not a detailed discussion of the differences between slices and arrays, but a brief introduction to how capacity and length affect slices, and what each is used for in different scenarios.



Statement:
This article is reproduced from linuxprobe.com.