Compared with C/C++, a big improvement in Go is the introduction of a garbage collector (GC). Users no longer manage memory themselves, which greatly reduces bugs caused by memory leaks. But GC also brings additional performance overhead, and improper use can even turn the GC into a performance bottleneck. When designing Go programs, therefore, pay special attention to object reuse to reduce pressure on the GC. Slice and string are basic Go types, and understanding their internal mechanisms helps us reuse these objects better.
The internal structure of slice and string
The structures can be found in $GOROOT/src/reflect/value.go. (Newer Go releases deprecate both types in favor of unsafe.String, unsafe.StringData, unsafe.Slice and unsafe.SliceData.)
    type StringHeader struct {
        Data uintptr
        Len  int
    }

    type SliceHeader struct {
        Data uintptr
        Len  int
        Cap  int
    }
A string holds a data pointer and a length. Both are fixed once the string is created, because strings are immutable.
A slice holds a data pointer, a length and a capacity. When the capacity is not enough, append allocates a new, larger backing array, copies the elements over and points Data at it. The old array is reclaimed by the GC once nothing else references it.
It follows from these structures that assigning a string or slice, or passing one as a parameter, is a shallow copy: only the header (the Data pointer, Len and, for slices, Cap) is copied, just as with any user-defined struct, and the underlying bytes or elements are shared.
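The shallow-copy behavior can be checked without touching the headers directly. The following is a minimal sketch; the helper names setFirst and sharesBackingArray are made up for this illustration and do not come from the article or the standard library.

```go
package main

import "fmt"

// setFirst writes through its parameter. The slice header is copied on the
// call, but the copy still points at the caller's backing array.
func setFirst(s []int, v int) {
	s[0] = v
}

// sharesBackingArray (hypothetical helper) reports whether two non-empty
// slices start at the same element of the same backing array.
func sharesBackingArray(a, b []int) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

func main() {
	original := []int{1, 2, 3}
	alias := original // copies Data, Len and Cap; no elements are copied

	setFirst(alias, 100)

	fmt.Println(original[0])                         // 100
	fmt.Println(sharesBackingArray(original, alias)) // true
}
```

The write made through alias is visible through original because only the three-word header was duplicated.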
slice reuse
append operation
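    // Fragment of a goconvey test: needs fmt, reflect, unsafe and a dot-import
    // of github.com/smartystreets/goconvey/convey.
    si1 := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
    si2 := si1
    si2 = append(si2, 0) // len == cap, so append must allocate a new array
    Convey("reallocates memory", func() {
        header1 := (*reflect.SliceHeader)(unsafe.Pointer(&si1))
        header2 := (*reflect.SliceHeader)(unsafe.Pointer(&si2))
        fmt.Println(header1.Data)
        fmt.Println(header2.Data)
        So(header1.Data, ShouldNotEqual, header2.Data)
    })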
si1 and si2 both point to the same array at first. When append is called on si2, the existing capacity (9) is not enough, so a new array has to be allocated and the Data value changes. $GOROOT/src/reflect/value.go also contains the strategy for choosing the new cap value, in the function grow: while cap is below 1024 it doubles each time, and beyond that it grows by 25% each time. (The exact thresholds differ between Go versions.) Growing like this costs extra time to copy the data from the old address to the new one, and the abandoned old array is additional work for the GC. So if you know the length of the data in advance, use make([]int, len, cap) to pre-allocate the memory. If you do not know the length, consider the memory reuse approach below.
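To make the cost of growth visible, one can count how often append has to move the data. This is a rough sketch: countReallocs is a hypothetical helper that treats a change in cap as a sign of reallocation, and the exact count for the non-preallocated case depends on the Go version.

```go
package main

import "fmt"

// countReallocs (hypothetical helper) appends n ints to s and counts how
// often the capacity changes, which indicates that append moved the data
// to a new backing array.
func countReallocs(s []int, n int) int {
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	const n = 10000

	// Growing from nil: several reallocations, each one copying the data.
	fmt.Println("no preallocation:", countReallocs(nil, n))

	// Capacity reserved up front: append never needs a new array.
	fmt.Println("preallocated:", countReallocs(make([]int, 0, n), n))
}
```

With the capacity reserved by make, the second call reports zero reallocations.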
Memory reuse
    si1 := []int{1, 2, 3, 4, 5, 6, 7, 8, 9}
    si2 := si1[:7] // len 7, cap 9: shares si1's backing array
    Convey("does not reallocate memory", func() {
        header1 := (*reflect.SliceHeader)(unsafe.Pointer(&si1))
        header2 := (*reflect.SliceHeader)(unsafe.Pointer(&si2))
        fmt.Println(header1.Data)
        fmt.Println(header2.Data)
        So(header1.Data, ShouldEqual, header2.Data)
    })
    Convey("append a value to the slice", func() {
        si2 = append(si2, 10) // fits in the spare capacity, writes to si1[7]
        Convey("changes the value in the original slice", func() {
            header1 := (*reflect.SliceHeader)(unsafe.Pointer(&si1))
            header2 := (*reflect.SliceHeader)(unsafe.Pointer(&si2))
            fmt.Println(header1.Data)
            fmt.Println(header2.Data)
            So(header1.Data, ShouldEqual, header2.Data)
            So(si1[7], ShouldEqual, 10)
        })
    })
si2 is a slice of si1. The first test shows that slicing does not reallocate memory: the Data pointers of si2 and si1 point to the same underlying array. The second test shows that appending a new value to si2 still allocates nothing, because si2 has spare capacity (len 7, cap 9), and that the write also changes si1[7], since both slices point to the same Data area. Using this feature, we only need si1 = si1[:0] to reset the length of si1 to zero again and again while keeping its capacity, which gives us memory reuse.
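The reset-and-refill pattern can be sketched as follows. The helper fill is a made-up name for this illustration, and the sketch assumes each refill fits within the existing capacity; a larger refill would make append allocate again.

```go
package main

import "fmt"

// fill (hypothetical helper) resets buf to length zero and appends n values.
// As long as n fits within cap(buf), append writes into the existing
// backing array instead of allocating a new one.
func fill(buf []int, n, base int) []int {
	buf = buf[:0] // keep the capacity, drop the contents
	for i := 0; i < n; i++ {
		buf = append(buf, base+i)
	}
	return buf
}

func main() {
	buf := make([]int, 0, 8)

	first := fill(buf, 5, 0)
	firstAddr := &first[0]

	second := fill(first, 8, 100)
	fmt.Println(second)                  // [100 101 102 103 104 105 106 107]
	fmt.Println(firstAddr == &second[0]) // true: same backing array
	fmt.Println(first)                   // [100 101 102 103 104]: old view sees the new data
}
```

The last line is the price of reuse: any slice that still refers to the old contents now sees the overwritten values.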
PS: Use copy(si2, si1) to make a deep copy. copy transfers only min(len(si2), len(si1)) elements, so si2 must already have the required length.
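A small sketch of a deep copy with copy, including the length pitfall. The function name clone is chosen for this example only.

```go
package main

import "fmt"

// clone (hypothetical helper) returns a copy of src with its own backing
// array. The destination is created with a length, not just a capacity,
// because copy transfers only min(len(dst), len(src)) elements.
func clone(src []int) []int {
	dst := make([]int, len(src))
	copy(dst, src)
	return dst
}

func main() {
	si1 := []int{1, 2, 3}
	si2 := clone(si1)
	si2[0] = 100

	fmt.Println(si1) // [1 2 3]
	fmt.Println(si2) // [100 2 3]

	// A zero-length destination receives nothing, whatever its capacity.
	empty := make([]int, 0, 3)
	fmt.Println(copy(empty, si1)) // 0
}
```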
string
    Convey("string constants", func() {
        str1 := "hello world"
        str2 := "hello world"
        Convey("same address", func() {
            header1 := (*reflect.StringHeader)(unsafe.Pointer(&str1))
            header2 := (*reflect.StringHeader)(unsafe.Pointer(&str2))
            fmt.Println(header1.Data)
            fmt.Println(header2.Data)
            So(header1.Data, ShouldEqual, header2.Data)
        })
    })
This example is relatively simple: identical string constants share the same address area.
    Convey("different substrings of the same string", func() {
        str1 := "hello world"[:6]
        str2 := "hello world"[:5]
        Convey("same address", func() {
            header1 := (*reflect.StringHeader)(unsafe.Pointer(&str1))
            header2 := (*reflect.StringHeader)(unsafe.Pointer(&str2))
            fmt.Println(header1.Data, str1)
            fmt.Println(header2.Data, str2)
            So(str1, ShouldNotEqual, str2)
            So(header1.Data, ShouldEqual, header2.Data)
        })
    })
Different substrings of the same string do not allocate any new memory. Note, however, that two strings occupy the same memory only when str1.Data == str2.Data && str1.Len == str2.Len, not merely when str1 == str2. In the following example str1 == str2, but their Data pointers are not the same.
    Convey("identical substrings of different strings", func() {
        str1 := "hello world"[:5]
        str2 := "hello golang"[:5]
        Convey("different addresses", func() {
            header1 := (*reflect.StringHeader)(unsafe.Pointer(&str1))
            header2 := (*reflect.StringHeader)(unsafe.Pointer(&str2))
            fmt.Println(header1.Data, str1)
            fmt.Println(header2.Data, str2)
            So(str1, ShouldEqual, str2)
            So(header1.Data, ShouldNotEqual, header2.Data)
        })
    })
For strings, you really only need to remember one thing: a string is immutable, so any string operation that only adjusts the internal data pointer and length (assignment, slicing) allocates no additional memory. I once thought I was being clever and designed a cache for strings to reduce the space taken up by repeated strings. In fact, unless the string itself is created from []byte, a string that is a substring of another string (such as the strings returned by strings.Split) takes no additional space, so the cache was simply unnecessary.
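The claim about strings.Split and []byte can be checked with a short sketch. It assumes Go 1.20 or later for unsafe.StringData, and the helpers within, firstField and viaBytes are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// within (hypothetical helper) reports whether the bytes of sub lie inside
// the memory occupied by s. The addresses are only compared, never
// converted back to pointers.
func within(s, sub string) bool {
	if len(s) == 0 || len(sub) == 0 {
		return false
	}
	start := uintptr(unsafe.Pointer(unsafe.StringData(s)))
	p := uintptr(unsafe.Pointer(unsafe.StringData(sub)))
	return p >= start && p+uintptr(len(sub)) <= start+uintptr(len(s))
}

// firstField returns the part of line before the first ";".
func firstField(line string) string {
	return strings.Split(line, ";")[0]
}

// viaBytes rebuilds s through a []byte, which copies the data.
func viaBytes(s string) string {
	b := []byte(s)
	return string(b)
}

func main() {
	line := "key=value;other=thing"
	field := firstField(line)
	rebuilt := viaBytes(field)

	fmt.Println(within(line, field))   // true: Split returns substrings of line
	fmt.Println(field == rebuilt)      // true: same contents
	fmt.Println(within(line, rebuilt)) // false: the []byte round trip copied the bytes
}
```

Only the round trip through []byte produces a string with its own storage; the field returned by strings.Split is just a view into line.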