Pay attention to these two points when using Go defer!-Golang-php.cn

defer is one of the more interesting keywords in the Go language. A simple example:

package main

import "fmt"

func main() {
    defer fmt.Println("煎鱼了")

    fmt.Println("脑子进")
}

The output result is:

脑子进
煎鱼了

A few days ago, some friends in my reader group discussed the following issue:

Put simply, the question is: does using the defer keyword inside a for loop affect performance?

The concern arises because, in Go's underlying implementation, defer records are kept in a linked list.

The worry is that if the loop runs many iterations, the defer linked list grows huge, which is hardly "elegant". Or perhaps Go's defer, much like the data structures in Redis, has been optimized internally so that the impact is actually small?
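To make the worry concrete, here is a minimal sketch of a linked list of deferred calls. It is a toy model, not the runtime's code: deferRecord, deferChain, push, and runAll are hypothetical names chosen for illustration, loosely mirroring the _defer record and its link field discussed later.

```go
package main

import "fmt"

// deferRecord is a toy stand-in for the runtime's _defer: a function plus a
// link to the record that was registered before it. (Hypothetical type.)
type deferRecord struct {
	fn   func()
	link *deferRecord
}

// deferChain models the per-goroutine list head. (Hypothetical type.)
type deferChain struct {
	head *deferRecord
	n    int // number of pending records
}

// push registers fn at the front of the chain, as a defer statement would.
func (c *deferChain) push(fn func()) {
	c.head = &deferRecord{fn: fn, link: c.head}
	c.n++
}

// runAll pops and runs records from the front, giving last-in, first-out order.
func (c *deferChain) runAll() {
	for c.head != nil {
		d := c.head
		c.head = d.link
		c.n--
		d.fn()
	}
}

func main() {
	var c deferChain
	var order []int
	for i := 0; i < 3; i++ {
		i := i // capture the current value for the closure
		c.push(func() { order = append(order, i) })
	}
	fmt.Println("records pending:", c.n) // records pending: 3
	c.runAll()
	fmt.Println("run order:", order) // run order: [2 1 0]
}
```

Each loop iteration adds one record, so in this model the chain grows linearly with the iteration count and nothing is released until the chain is drained.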

In today's article we explore defer inside loops. Does an overly long underlying linked list cause problems? If so, what exactly are the effects?

Let's get started.

defer performance improved by 30%

Back in Go 1.13, the Go team carried out a round of performance optimization on defer, which improved defer performance by 30% in most scenarios.

Let's review what changed in Go 1.13 and where defer was optimized. This is the key to the question.

Before and after

In Go 1.12 and earlier, the assembly generated for a defer call looks like this:

    0x0070 00112 (main.go:6)    CALL    runtime.deferproc(SB)
    0x0075 00117 (main.go:6)    TESTL    AX, AX
    0x0077 00119 (main.go:6)    JNE    137
    0x0079 00121 (main.go:7)    XCHGL    AX, AX
    0x007a 00122 (main.go:7)    CALL    runtime.deferreturn(SB)
    0x007f 00127 (main.go:7)    MOVQ    56(SP), BP

In Go 1.13 and later, the assembly generated for a defer call looks like this:

    0x006e 00110 (main.go:4)    MOVQ    AX, (SP)
    0x0072 00114 (main.go:4)    CALL    runtime.deferprocStack(SB)
    0x0077 00119 (main.go:4)    TESTL    AX, AX
    0x0079 00121 (main.go:4)    JNE    139
    0x007b 00123 (main.go:7)    XCHGL    AX, AX
    0x007c 00124 (main.go:7)    CALL    runtime.deferreturn(SB)
    0x0081 00129 (main.go:7)    MOVQ    112(SP), BP

Judging from the assembly, the call to runtime.deferproc has been replaced by a call to runtime.deferprocStack. What optimization does this bring?

Let's hold that question and read on.

The smallest unit of defer: _defer

Compared with earlier versions, the _defer struct, the smallest unit of Go defer, mainly gains a heap field:

type _defer struct {
    siz     int32 // includes both arguments and results
    started bool
    heap    bool
    sp      uintptr // sp at time of defer
    pc      uintptr
    fn      *funcval
    ...
}

This field records whether the _defer is allocated on the heap or on the stack. The other fields have not changed noticeably, so we can focus on the stack allocation of defer and see what is done there.

deferprocStack

func deferprocStack(d *_defer) {
    gp := getg()
    if gp.m.curg != gp {
        throw("defer on system stack")
    }
    
    d.started = false
    d.heap = false
    d.sp = getcallersp()
    d.pc = getcallerpc()

    *(*uintptr)(unsafe.Pointer(&d._panic)) = 0
    *(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
    *(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))

    return0()
}

This code is fairly routine: it mainly records the stack pointer of the function that called defer, the address of the arguments passed to the function, and the PC (program counter). This was covered in detail in the earlier article "In-depth Understanding of Go Defer", so I won't repeat it here.

What is so special about deferprocStack?

You can see that it sets d.heap to false, which means deferprocStack handles the case where the _defer is allocated on the stack.

deferproc

So where is the heap-allocated case handled?

func newdefer(siz int32) *_defer {
    ...
    d.heap = true
    d.link = gp._defer
    gp._defer = d
    return d
}

newdefer is called from the following place:

func deferproc(siz int32, fn *funcval) { // arguments of fn follow fn
    ...
    sp := getcallersp()
    argp := uintptr(unsafe.Pointer(&fn)) + unsafe.Sizeof(fn)
    callerpc := getcallerpc()

    d := newdefer(siz)
    ...
}

It is clear that deferproc, the method called in earlier versions, now handles the case where the _defer is allocated on the heap.

Summary

deferproc has not been removed; rather, the process around it has been optimized.
The compiler chooses between deferproc and deferprocStack according to the scenario. They handle allocation on the heap and on the stack respectively.

Where is the optimization?

The main optimization lies in a change to the stack allocation rules for defer records. The measure is:

The compiler analyzes the for-loop nesting depth of each defer.

// src/cmd/compile/internal/gc/esc.go
case ODEFER:
    if e.loopdepth == 1 { // top level
        n.Esc = EscNever // force stack allocation of defer record (see ssa.go)
        break
    }

If the Go compiler detects that the loop depth (loopdepth) is 1, it sets the escape analysis result so that the defer record is allocated on the stack; otherwise it is allocated on the heap.

// src/cmd/compile/internal/gc/ssa.go
case ODEFER:
    d := callDefer
    if n.Esc == EscNever {
        d = callDeferStack
    }
    s.call(n.Left, d)

This eliminates much of the overhead previously caused by frequent calls to systemstack, mallocgc, and similar functions, thereby improving performance in most scenarios.

Calling defer in a loop
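The compiler rule described earlier (loop depth 1 means the stack path, anything deeper means the heap path) can be modeled with a toy tree walk. This is only a sketch under simplifying assumptions: the node type and the classify function are hypothetical and are not the compiler's real data structures. Only the names deferproc and deferprocStack come from the runtime.

```go
package main

import "fmt"

// node is a hypothetical, heavily simplified syntax-tree node.
type node struct {
	kind string // "func", "for", or "defer"
	body []*node
}

// classify walks the tree and records, for each defer in source order, which
// runtime entry point the loop-depth rule would pick.
func classify(n *node, loopDepth int, out *[]string) {
	switch n.kind {
	case "defer":
		if loopDepth == 1 { // top level: stack-allocated record
			*out = append(*out, "deferprocStack")
		} else { // inside a loop: heap-allocated record
			*out = append(*out, "deferproc")
		}
	case "for":
		loopDepth++ // everything in the loop body is one level deeper
	}
	for _, c := range n.body {
		classify(c, loopDepth, out)
	}
}

func main() {
	// Models: func f() { defer ...; for ... { defer ... } }
	fn := &node{kind: "func", body: []*node{
		{kind: "defer"},
		{kind: "for", body: []*node{{kind: "defer"}}},
	}}
	var out []string
	classify(fn, 1, &out)
	fmt.Println(out) // [deferprocStack deferproc]
}
```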

Back to the question itself. Now that we know how the defer optimization works: does using the defer keyword in a loop affect performance?

The most direct effect is that the roughly 30% performance gain is lost entirely. And because of the incorrect usage, the overhead that defer inherently carries also grows in theory (the linked list gets longer), so performance gets worse.
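One way to check this on your own machine is a small benchmark. The sketch below uses testing.Benchmark from the standard library so it can run as an ordinary program. The functions topLevel and inLoop and the variable sink are hypothetical names for illustration. The numbers depend heavily on the Go version and hardware, so no expected output is shown.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the deferred closures from being trivially empty.
var sink int

// topLevel registers one defer at the top level of the function.
func topLevel() {
	defer func() { sink++ }()
}

// inLoop registers n defers from inside a for loop; all of them stay pending
// until inLoop returns.
func inLoop(n int) {
	for i := 0; i < n; i++ {
		defer func() { sink++ }()
	}
}

func main() {
	const n = 100

	top := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			for j := 0; j < n; j++ {
				topLevel() // n calls, each with a single top-level defer
			}
		}
	})
	loop := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			inLoop(n) // one call that defers n times inside a loop
		}
	})

	fmt.Println("top-level defer, 100 calls:", top)
	fmt.Println("defer in loop, 100 defers: ", loop)
}
```

Both variants run the same number of deferred closures, so any difference mostly reflects how the defer records are created and chained.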

So we need to avoid the following two scenarios:

Explicit loop: there is an explicit loop around the defer call, such as a for-loop statement.
Implicit loop: the defer call sits inside loop-like logic, such as a goto statement.

Explicit loop

The first example uses the defer keyword directly inside a for loop:

func main() {
    for i := 0; i < ...; i++ {
        defer ...
    }
}

This is also the most common pattern. Whether writing a crawler or launching goroutines, many people like to write it this way.

This is an explicit use of a loop.

Implicit loop

The second example uses something like the goto keyword in the code:

func main() {
    i := 1
food:
    defer func() {}()
    if i == 1 {
        i -= 1
        goto food
    }
}

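This style is relatively rare, because goto is sometimes even banned by coding standards, mainly because it tends to be abused, so most people implement the logic some other way.

This is an implicit loop: the goto produces a loop-like effect.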
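To see that the goto version really behaves like a loop, the following sketch counts how many deferred calls run. It reuses the shape of the example above. The function name countDefers and the named result ran are assumptions added for illustration.

```go
package main

import "fmt"

// countDefers mirrors the goto example and reports how many deferred calls
// ran. The named result lets the deferred closures update the return value.
func countDefers() (ran int) {
	i := 1
food:
	defer func() { ran++ }() // executed once per pass over the label
	if i == 1 {
		i -= 1
		goto food // jumps back, so the defer statement executes again
	}
	return ran
}

func main() {
	fmt.Println(countDefers()) // 2
}
```

The defer statement is written once but executed twice, so two records are registered, exactly as if it sat inside a two-iteration loop.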

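Conclusion

Clearly, there is nothing especially magical about the design of defer. It has mainly been optimized for real-world usage scenarios, and achieves fairly good performance as a result.

Although defer itself carries a small overhead, it is nowhere near as unusable as some imagine. Only when the code containing the defer runs very frequently do you need to consider optimizing it.

Otherwise there is no need to agonize over it. In practice, when you suspect or run into a performance problem, look at the PProf analysis, check whether defer is on the relevant hot path, and then optimize sensibly.

The so-called optimization may be nothing more than removing the defer and making the call manually, which is not complicated. Avoiding the two pitfalls of explicit and implicit loops around defer when writing code is enough to get the best performance.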
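As a sketch of what that manual rewrite might look like, the example below compares a defer inside a loop with two alternatives: calling the cleanup by hand, and moving the loop body into a helper so the defer sits at the top level of its own function. The resource type and the open, processOne, and other function names are hypothetical stand-ins for a file, connection, or similar handle.

```go
package main

import "fmt"

// resource is a hypothetical stand-in for a file, connection, or similar.
type resource struct {
	id  int
	log *[]string
}

// open records the event and returns a new resource.
func open(id int, log *[]string) *resource {
	*log = append(*log, fmt.Sprintf("open %d", id))
	return &resource{id: id, log: log}
}

// Close records that the resource was released.
func (r *resource) Close() {
	*r.log = append(*r.log, fmt.Sprintf("close %d", r.id))
}

// deferInLoop defers inside the loop: every Close waits until the function
// returns, and one defer record is created per iteration.
func deferInLoop(n int, log *[]string) {
	for i := 0; i < n; i++ {
		r := open(i, log)
		defer r.Close()
	}
}

// manualClose drops defer and releases the resource by hand each iteration.
func manualClose(n int, log *[]string) {
	for i := 0; i < n; i++ {
		r := open(i, log)
		// ... use r ...
		r.Close()
	}
}

// processOne keeps defer, but at the top level of its own function.
func processOne(id int, log *[]string) {
	r := open(id, log)
	defer r.Close()
	// ... use r ...
}

// deferPerCall calls the helper once per iteration.
func deferPerCall(n int, log *[]string) {
	for i := 0; i < n; i++ {
		processOne(i, log)
	}
}

func main() {
	var a, b, c []string
	deferInLoop(2, &a)
	manualClose(2, &b)
	deferPerCall(2, &c)
	fmt.Println(a) // [open 0 open 1 close 1 close 0]
	fmt.Println(b) // [open 0 close 0 open 1 close 1]
	fmt.Println(c) // [open 0 close 0 open 1 close 1]
}
```

The manual version is the simplest, but it skips the cleanup if the code in between panics or returns early. The helper version keeps the safety of defer while releasing each resource as soon as its iteration finishes.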
