defer is one of the more interesting keywords in Go. Here is an example:
package main

import "fmt"

func main() {
	defer fmt.Println("煎鱼了")
	fmt.Println("脑子进")
}
The output is:
脑子进
煎鱼了
A few days ago, some friends in my reader group discussed a question. Put simply: does using the defer keyword inside a for loop have any performance impact?
The question comes up because, in Go's underlying implementation, defer records are kept in a linked list.
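To make the linked-list picture concrete, here is a toy model in Go. It is only a sketch under my own assumptions: the names deferNode, deferList, push, and runAll are made up for illustration and are not the runtime's real types. It shows the general idea that each new defer is linked in at the head of the list, and that the records are run from the head when the function exits.

```go
package main

import "fmt"

// deferNode is a hypothetical stand-in for one deferred call record.
type deferNode struct {
	fn   func()
	link *deferNode // next (older) record in the list
}

// deferList is a hypothetical stand-in for a goroutine's defer chain.
type deferList struct {
	head *deferNode
	size int
}

// push links a new record in at the head, like registering one more defer.
func (l *deferList) push(fn func()) {
	l.head = &deferNode{fn: fn, link: l.head}
	l.size++
}

// runAll pops records from the head and runs them, so the most recently
// registered call runs first (last in, first out).
func (l *deferList) runAll() {
	for l.head != nil {
		d := l.head
		l.head = d.link
		l.size--
		d.fn()
	}
}

func main() {
	var l deferList
	for i := 0; i < 3; i++ {
		i := i // capture the current value of i
		l.push(func() { fmt.Println("deferred call", i) })
	}
	fmt.Println("records in list:", l.size)
	l.runAll()
}
```

In this model, a loop of N iterations leaves N records in the list until runAll is called, which is exactly the growth people in the group were worried about.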
People worry that if the loop is large, the defer linked list becomes huge, which is not "excellent" enough. Or could it be that Go's defer, like the data structures in Redis, is optimized internally, so that in practice there is no big impact?
In today's article, we explore defer in loops. Does an overly long underlying linked list cause problems? If so, what exactly is the impact?
Let's get started.
defer performance improved by 30%
Back in Go 1.13, defer went through a round of performance optimization that made it about 30% faster in most scenarios.
Let's review the Go 1.13 changes and see where defer was optimized. This is the key to the question.
Before and after
In Go 1.12 and earlier, the assembly generated for a defer call looks like this:
0x0070 00112 (main.go:6) CALL runtime.deferproc(SB)
0x0075 00117 (main.go:6) TESTL AX, AX
0x0077 00119 (main.go:6) JNE 137
0x0079 00121 (main.go:7) XCHGL AX, AX
0x007a 00122 (main.go:7) CALL runtime.deferreturn(SB)
0x007f 00127 (main.go:7) MOVQ 56(SP), BP
In Go 1.13 and later, the assembly generated for a defer call looks like this:
0x006e 00110 (main.go:4) MOVQ AX, (SP)
0x0072 00114 (main.go:4) CALL runtime.deferprocStack(SB)
0x0077 00119 (main.go:4) TESTL AX, AX
0x0079 00121 (main.go:4) JNE 139
0x007b 00123 (main.go:7) XCHGL AX, AX
0x007c 00124 (main.go:7) CALL runtime.deferreturn(SB)
0x0081 00129 (main.go:7) MOVQ 112(SP), BP
Judging from the assembly, the call to runtime.deferproc has been replaced by a call to runtime.deferprocStack. What optimization does this bring?
Let's hold on to that question and read on.
The smallest unit of defer: _defer
Compared with earlier versions, the _defer struct, the smallest unit of defer, mainly gains a heap field:
type _defer struct {
	siz     int32 // includes both arguments and results
	started bool
	heap    bool
	sp      uintptr // sp at time of defer
	pc      uintptr
	fn      *funcval
	...
}
This field marks whether the _defer is allocated on the heap or on the stack. The other fields have not changed noticeably, so we can focus on how defer is allocated between the stack and the heap and see what was done.
func deferprocStack(d *_defer) {
gp := getg()
if gp.m.curg != gp {
throw("defer on system stack")
}
d.started = false
d.heap = false
d.sp = getcallersp()
d.pc = getcallerpc()
*(*uintptr)(unsafe.Pointer(&d._panic)) = 0
*(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
*(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))
return0()
}
This code is fairly routine. It mainly obtains the stack pointer of the function that calls defer, the address of the arguments passed to the function, and the PC (program counter). This was covered in detail in the earlier article "In-depth Understanding of Go Defer", so I won't go into it again here.
So what is special about deferprocStack? It sets d.heap to false, which means deferprocStack handles the case where the _defer is allocated on the stack.
The heap case, in turn, shows up in newdefer, which sets d.heap to true:
func newdefer(siz int32) *_defer {
	...
	d.heap = true
	d.link = gp._defer
	gp._defer = d
	return d
}
And where is newdefer called? Here:
func deferproc(siz int32, fn *funcval) { // arguments of fn follow fn
	...
	sp := getcallersp()
	argp := uintptr(unsafe.Pointer(&fn)) + unsafe.Sizeof(fn)
	callerpc := getcallerpc()

	d := newdefer(siz)
	...
}
Clearly, the deferproc method that earlier versions called is now used for the heap-allocated case.
- What is certain is that deferproc has not been removed; rather, the process has been optimized.
- The Go compiler chooses between deferproc and deferprocStack depending on the scenario. They handle allocation on the heap and on the stack respectively.
The compiler analyzes the for-loop nesting depth of each defer:
// src/cmd/compile/internal/gc/esc.go
case ODEFER:
	if e.loopdepth == 1 { // top level
		n.Esc = EscNever // force stack allocation of defer record (see ssa.go)
		break
	}
If the Go compiler detects that the loop depth (loopdepth) is 1, it sets the escape analysis result so that the defer record is allocated on the stack; otherwise it is allocated on the heap.
// src/cmd/compile/internal/gc/ssa.go
case ODEFER:
	d := callDefer
	if n.Esc == EscNever {
		d = callDeferStack
	}
	s.call(n.Left, d)
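The two compiler snippets above work as a pair: escape analysis tags the defer, and SSA generation picks the runtime call based on that tag. The following is a simplified model of that decision, not the compiler's actual code. The names escResult, deferStmt, analyze, and pickRuntimeCall are hypothetical, and I am assuming that a defer directly inside one for loop is seen at a loop depth greater than 1.

```go
package main

import "fmt"

// escResult is a hypothetical stand-in for the compiler's escape tag.
type escResult int

const (
	escUnknown escResult = iota
	escNever
)

// deferStmt is a hypothetical stand-in for the compiler's ODEFER node.
type deferStmt struct {
	esc escResult
}

// analyze models the esc.go snippet: only a defer at loop depth 1
// (the top level of the function) is tagged as never escaping.
func analyze(n *deferStmt, loopDepth int) {
	if loopDepth == 1 {
		n.esc = escNever
	}
}

// pickRuntimeCall models the ssa.go snippet: a defer tagged escNever
// takes the stack path, everything else takes the heap path.
func pickRuntimeCall(n *deferStmt) string {
	if n.esc == escNever {
		return "deferprocStack"
	}
	return "deferproc"
}

func main() {
	for _, depth := range []int{1, 2} {
		n := &deferStmt{}
		analyze(n, depth)
		fmt.Printf("loop depth %d -> runtime.%s\n", depth, pickRuntimeCall(n))
	}
}
```

The point of the model is that the choice is made once, at compile time, from where the defer statement sits in the source, not from how many times the loop happens to run.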
This removes the large overhead that frequent calls to systemstack, mallocgc, and similar functions used to incur, which improves performance in most scenarios.
Calling defer in a loop
Now that we know how the defer optimization works, let's return to the original question: "Does the defer keyword in a loop have any performance impact?"
The most direct impact is that the roughly 30% optimization is lost entirely. On top of that, because of the improper usage, the overhead that defer theoretically has grows as the linked list gets longer, so performance gets worse.
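If you want to measure this on your own machine, a rough comparison can be set up with testing.Benchmark from the standard library. This is a sketch, not a rigorous benchmark: the function names, the counter variable, and the loop size of 100 are my own choices for illustration, and the numbers will vary with the Go version and hardware, so I am not quoting any results here.

```go
package main

import (
	"fmt"
	"testing"
)

// counter only exists so the deferred calls have an observable effect.
var counter int

// oneDeferPerCall runs a single defer at the top level of a function.
func oneDeferPerCall() {
	defer func() { counter++ }()
}

// topLevelDefers does n units of work, each in its own function call,
// so every defer sits at the top level of a function.
func topLevelDefers(n int) {
	for i := 0; i < n; i++ {
		oneDeferPerCall()
	}
}

// loopDefers does the same n units of work with defer inside the loop,
// so n deferred calls pile up until the function returns.
func loopDefers(n int) {
	for i := 0; i < n; i++ {
		defer func() { counter++ }()
	}
}

func main() {
	const n = 100 // hypothetical loop size

	top := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			topLevelDefers(n)
		}
	})
	loop := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			loopDefers(n)
		}
	})

	fmt.Println("top-level defer:", top)
	fmt.Println("defer in loop:  ", loop)
}
```

Both variants run the same number of deferred calls. The only difference is whether each defer statement sits at the top level of a function or inside a loop.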
So we need to avoid the following two scenarios:
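- Explicit loops: there is an explicit loop around the defer call, for example a for-loop statement.
- Implicit loops: the defer call sits inside logic that behaves like a loop, for example a goto statement.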
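Explicit loops
The first example uses the defer keyword directly inside a for loop:
func main() {
	for i := 0; i < ...
This is also the most common pattern. Whether they are writing a crawler or launching goroutines, many people like to write it this way.
This is an explicit loop.
Implicit loops
The second example uses something like the goto keyword:
func main() {
	i := 1
food:
	defer func() {}()
	if i == 1 {
		i -= 1
		goto food
	}
}
This style is less common, because coding standards sometimes even forbid the goto keyword, mainly because it tends to be abused, so most people implement the logic some other way.
This is an implicit loop: the jump has a loop-like effect.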
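To see that both forms really do stack up deferred calls, here is a small runnable sketch. It is my own illustration rather than the original snippets: the function names, the loop bound n, and the counting through a named result are all assumptions made so the effect can be observed.

```go
package main

import "fmt"

// explicitLoop registers one deferred call per iteration of a for loop.
// The named result ran is incremented by each deferred call, so the
// returned value is the number of deferred calls that ran on exit.
func explicitLoop(n int) (ran int) {
	for i := 0; i < n; i++ {
		defer func() { ran++ }()
	}
	return ran
}

// implicitLoop jumps back to a label with goto, so the same defer
// statement is reached twice and registers two deferred calls.
func implicitLoop() (ran int) {
	i := 1
food:
	defer func() { ran++ }()
	if i == 1 {
		i -= 1
		goto food
	}
	return ran
}

func main() {
	fmt.Println("explicit loop, deferred calls run:", explicitLoop(3))
	fmt.Println("implicit loop, deferred calls run:", implicitLoop())
}
```

With n set to 3, explicitLoop reports 3 deferred calls and implicitLoop reports 2. None of them run until the function returns, which is why the list keeps growing for the whole duration of the loop.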
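Summary
Clearly, there is nothing particularly magical about the design of defer. It has mainly been optimized for real-world usage scenarios, and it achieves good performance as a result.
Although defer itself carries a little overhead, it is nowhere near as unusable as people imagine. You only need to think about optimizing when the code containing the defer is executed frequently.
Otherwise there is no need to agonize over it. In practice, when you suspect or run into a performance problem, look at the PProf analysis, check whether defer is on the relevant hot path, and then optimize sensibly.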
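The so-called optimization may be nothing more than removing defer and making the call manually, which is not complicated. When coding, avoiding the two pitfalls of explicit and implicit defer loops is enough to get the best performance.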
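Here is a minimal sketch of the two usual ways out, using a mutex as the resource being released. The store type and its methods are hypothetical names chosen for illustration. The first variant moves the loop body into its own method so the defer sits at the top level of a function again; the second drops defer and releases the lock by hand.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical shared structure guarded by a mutex.
type store struct {
	mu    sync.Mutex
	items []int
}

// add holds the lock for a single item. The defer is at the top level
// of this method, so it runs at the end of every call.
func (s *store) add(v int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items = append(s.items, v)
}

// addAllWithHelper keeps defer, but out of the loop: each iteration
// calls add, which registers and runs exactly one deferred call.
func (s *store) addAllWithHelper(vs []int) {
	for _, v := range vs {
		s.add(v)
	}
}

// addAllManual drops defer and unlocks by hand inside the loop.
func (s *store) addAllManual(vs []int) {
	for _, v := range vs {
		s.mu.Lock()
		s.items = append(s.items, v)
		s.mu.Unlock()
	}
}

func main() {
	s := &store{}
	s.addAllWithHelper([]int{1, 2, 3})
	s.addAllManual([]int{4, 5})
	fmt.Println("items stored:", len(s.items))
}
```

The helper variant keeps the safety of defer if the body panics, while the manual variant avoids defer entirely. Which one fits better depends on how complicated the loop body is.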