Home  >  Article  >  Backend Development  >  The underlying implementation of Go function closure

The underlying implementation of Go function closure

Go语言进阶学习
Go语言进阶学习forward
2023-07-25 15:18:341261browse

Function closure is not an advanced vocabulary for most readers, so what is a closure? Here is an excerpt from the definition on Wiki:

a closure is a record storing a function together with an environment.The environment is a mapping associating each free variable of the function (variables that are used locally, but defined in an enclosing scope) with the value or reference to which the name was bound when the closure was created.

##简In short, a closure is an entity composed of a function and a reference environment. In the implementation process, closures are often implemented by calling external functions and returning their internal functions. Among them, the reference environment refers to the mapping of free variables in the external function (used by the internal function, but defined in the external function). The internal function introduces external free variables so that these variables will not be released or deleted even if they leave the environment of the external function. The returned internal function still holds this information.

The underlying implementation of Go function closure

This passage may not be easy to understand, so let’s just look at an example.

 1package main
 2
 3import "fmt"
 4
 5func outer() func() int {
 6    x := 1
 7    return func() int {
 8        x++
 9        return x
10    }
11}
12
13func main() {
14    closure := outer()
15    fmt.Println(closure())
16    fmt.Println(closure())
17}
18
19// output
202
213

As you can see, two features in Go (functions are first-class citizens and support for anonymous functions) make it easy to implement closures.

In the above example, <span style="font-size: 15px;letter-spacing: 1px;">closure</span> is the closure function, the variable x is the reference environment, and their combination is the closure entity. <span style="font-size: 15px;letter-spacing: 1px;">x</span>This is a local variable within the <span style="font-size: 15px;letter-spacing: 1px;">outer</span> function and outside the anonymous function . After the normal function call ends, <span style="font-size: 15px;letter-spacing: 1px;">x</span> will be destroyed as the function stack is destroyed. However, due to the reference of the anonymous function, the function object returned by <span style="font-size: 15px;letter-spacing: 1px;">outer</span> will always be held <span style="font-size: 15px;letter-spacing: 1px;">x</span>variable. This causes each time the closure <span style="font-size: 15px;letter-spacing: 1px;">closure</span> is called, the <span style="font-size: 15px;letter-spacing: 1px;">x</span> variable will be accumulated .

This is different from ordinary function calls: the local variables <span style="font-size: 15px;letter-spacing: 1px;">x</span># do not end with the function call. disappear. So, why is this?


实现原理

我们不妨从汇编入手,将上述代码稍微修改一下

 1package main
 2
 3func outer() func() int {
 4    x := 1
 5    return func() int {
 6        x++
 7        return x
 8    }
 9}
10
11func main() {
12    _ := outer()
13}

得到编译后的汇编语句如下。

 1$ go tool compile -S -N -l main.go 
 2"".outer STEXT size=181 args=0x8 locals=0x28
 3        0x0000 00000 (main.go:3)        TEXT    "".outer(SB), ABIInternal, $40-8
 4        ...
 5        0x0021 00033 (main.go:3)        MOVQ    $0, "".~r0+48(SP)
 6        0x002a 00042 (main.go:4)        LEAQ    type.int(SB), AX
 7        0x0031 00049 (main.go:4)        MOVQ    AX, (SP)
 8        0x0035 00053 (main.go:4)        PCDATA  $1, $0
 9        0x0035 00053 (main.go:4)        CALL    runtime.newobject(SB)
10        0x003a 00058 (main.go:4)        MOVQ    8(SP), AX
11        0x003f 00063 (main.go:4)        MOVQ    AX, "".&x+24(SP)
12        0x0044 00068 (main.go:4)        MOVQ    $1, (AX)
13        0x004b 00075 (main.go:5)        LEAQ    type.noalg.struct { F uintptr; "".x *int }(SB), AX
14        0x0052 00082 (main.go:5)        MOVQ    AX, (SP)
15        0x0056 00086 (main.go:5)        PCDATA  $1, $1
16        0x0056 00086 (main.go:5)        CALL    runtime.newobject(SB)
17        0x005b 00091 (main.go:5)        MOVQ    8(SP), AX
18        0x0060 00096 (main.go:5)        MOVQ    AX, ""..autotmp_4+16(SP)
19        0x0065 00101 (main.go:5)        LEAQ    "".outer.func1(SB), CX
20        0x006c 00108 (main.go:5)        MOVQ    CX, (AX)
21        ...

首先,我们发现不一样的是 <span style="font-size: 15px;letter-spacing: 1px;">x:=1</span> 会调用 <span style="font-size: 15px;letter-spacing: 1px;">runtime.newobject</span> 函数(内置<span style="font-size: 15px;letter-spacing: 1px;">new</span>函数的底层函数,它返回数据类型指针)。在正常函数局部变量的定义时,例如

 1package main
 2
 3func add() int {
 4    x := 100
 5    x++
 6    return x
 7}
 8
 9func main() {
10    _ = add()
11}

我们能发现 <span style="font-size: 15px;letter-spacing: 1px;">x:=100</span> 是不会调用 <span style="font-size: 15px;letter-spacing: 1px;">runtime.newobject</span> 函数的,它对应的汇编是如下

1"".add STEXT nosplit size=58 args=0x8 locals=0x10
2        0x0000 00000 (main.go:3)        TEXT    "".add(SB), NOSPLIT|ABIInternal, $16-8
3        ...
4        0x000e 00014 (main.go:3)        MOVQ    $0, "".~r0+24(SP)
5        0x0017 00023 (main.go:4)        MOVQ    $100, "".x(SP)  // x:=100
6        0x001f 00031 (main.go:5)        MOVQ    $101, "".x(SP)
7        0x0027 00039 (main.go:6)        MOVQ    $101, "".~r0+24(SP)
8        ...

留着疑问,继续往下看。我们发现有以下语句

1        0x004b 00075 (main.go:5)        LEAQ    type.noalg.struct { F uintptr; "".x *int }(SB), AX
2        0x0052 00082 (main.go:5)        MOVQ    AX, (SP)
3        0x0056 00086 (main.go:5)        PCDATA  $1, $1
4        0x0056 00086 (main.go:5)        CALL    runtime.newobject(SB)

我们看到 <span style="font-size: 15px;letter-spacing: 1px;">type.noalg.struct { F uintptr; "".x *int }(SB)</span>,这其实就是定义的一个闭包数据类型,它的结构表示如下

1type closure struct {
2    F uintptr   // 函数指针,代表着内部匿名函数
3    x *int      // 自由变量x,代表着对外部环境的引用
4}

之后,在通过 <span style="font-size: 15px;letter-spacing: 1px;">runtime.newobject</span> 函数创建了闭包对象。而且由于 <span style="font-size: 15px;letter-spacing: 1px;">LEAQ xxx yyy</span>代表的是将 <span style="font-size: 15px;letter-spacing: 1px;">xxx</span> 指针,传递给 <span style="font-size: 15px;letter-spacing: 1px;">yyy</span>,因此 <span style="font-size: 15px;letter-spacing: 1px;">outer</span> 函数最终的返回,其实是闭包结构体对象指针。在《详解逃逸分析》一文中,我们详细地描述了Go编译器的逃逸分析机制,对于这种函数返回暴露给外部指针的情况,很明显,闭包对象会被分配至堆上,变量x也会随着对象逃逸至堆。这就很好地解释了为什么<span style="font-size: 15px;letter-spacing: 1px;">x</span>变量没有随着函数栈的销毁而消亡。

我们可以通过逃逸分析来验证我们的结论

 1$  go build -gcflags &#39;-m -m -l&#39; main.go
 2# command-line-arguments
 3./main.go:6:3: outer.func1 capturing by ref: x (addr=true assign=true width=8)
 4./main.go:5:9: func literal escapes to heap:
 5./main.go:5:9:   flow: ~r0 = &{storage for func literal}:
 6./main.go:5:9:     from func literal (spill) at ./main.go:5:9
 7./main.go:5:9:     from return func literal (return) at ./main.go:5:2
 8./main.go:4:2: x escapes to heap:
 9./main.go:4:2:   flow: {storage for func literal} = &x:
10./main.go:4:2:     from func literal (captured by a closure) at ./main.go:5:9
11./main.go:4:2:     from x (reference) at ./main.go:6:3
12./main.go:4:2: moved to heap: x                   // 变量逃逸
13./main.go:5:9: func literal escapes to heap       // 函数逃逸

至此,我相信读者已经明白为什么闭包能持续持有外部变量的原因了。那么,我们来思考上文中留下的疑问,为什么在<span style="font-size: 15px;letter-spacing: 1px;">x:=1</span> 时会调用 <span style="font-size: 15px;letter-spacing: 1px;">runtime.newobject</span> 函数。

我们将上文中的例子改为如下,即删掉 <span style="font-size: 15px;letter-spacing: 1px;">x++</span> 代码

 1package main
 2
 3func outer() func() int {
 4    x := 1
 5    return func() int {
 6        return x
 7    }
 8}
 9
10func main() {
11    _ = outer()
12}

此时,<span style="font-size: 15px;letter-spacing: 1px;">x:=1</span>处的汇编代码,将不再调用 <span style="font-size: 15px;letter-spacing: 1px;">runtime.newobject</span> 函数,通过逃逸分析也会发现将<span style="font-size: 15px;letter-spacing: 1px;">x</span>不再逃逸,生成的闭包对象中的<span style="font-size: 15px;letter-spacing: 1px;">x</span>的将是值类型<span style="font-size: 15px;letter-spacing: 1px;">int</span>

1type closure struct {
2    F uintptr 
3    x int      
4}

这其实就是Go编译器做得精妙的地方:当闭包内没有对外部变量造成修改时,Go 编译器会将自由变量的引用传递优化为直接值传递,避免变量逃逸。


总结

函数闭包一点也不神秘,它就是函数和引用环境而组合的实体。在Go中,闭包在底层是一个结构体对象,它包含了函数指针与自由变量。

The escape analysis mechanism of the Go compiler will allocate the closure object to the heap, so that the free variable will not disappear when the function stack is destroyed, and it can always exist depending on the closure entity. Therefore, the advantages and disadvantages of using closures are obvious: closures can avoid using global variables and instead maintain free variables stored in memory for a long time; however, this implicit holding of free variables will cause problems when used improperly. It will easily cause memory waste and leakage.

In actual projects, there are not many usage scenarios for closures. Of course, if you write a closure in your code, for example, a callback function you write forms a closure, you need to be careful, otherwise the memory usage problem may cause you trouble.

The above is the detailed content of The underlying implementation of Go function closure. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:Go语言进阶学习. If there is any infringement, please contact admin@php.cn delete