Detailed analysis of garbage collection and memory management in PHP
This article gives a detailed analysis of garbage collection and memory management in PHP.
Reference counting
In PHP 5.2 and earlier, PHP's garbage collection used the reference counting algorithm.
Basic knowledge of reference counting
PHP variables are stored in a "zval" variable container (a data structure). A zval holds the following information:
The data type of the variable;
The value of the variable;
is_ref, a Boolean flag that indicates whether the variable is bound by reference (&);
refcount, the number of variable names (symbols) that point to this zval container. Note that "reference" here means how many symbols use the zval, not assignment by reference; keep the two apart.
When a variable is assigned a value, a corresponding "zval" variable container is created.
View variable zval container information
To view a variable's "zval" container information (that is, its is_ref and refcount), you can use the xdebug_debug_zval() function provided by the Xdebug debugging tool.
How to install the Xdebug extension is covered in this tutorial. For how to use Xdebug, please read the official documentation.
Assume that we have successfully installed the XDebug tool and can now debug variables.
View the zval information of ordinary variables
If a PHP statement simply assigns a value to a variable, is_ref is 0 and refcount is 1. If that variable is then assigned as a value to another variable, the refcount of the zval container increases by 1. Likewise, when a variable is destroyed (unset), refcount decreases by 1.
Please see the following example:
<?php
// On assignment, refcount equals 1
$name = 'liugongzi';
xdebug_debug_zval('name'); // (refcount=1, is_ref=0)string 'liugongzi' (length=9)

// $name is assigned as a value to another variable, refcount increases by 1
$copy = $name;
xdebug_debug_zval('name'); // (refcount=2, is_ref=0)string 'liugongzi' (length=9)

// Destroying a variable decreases refcount by 1
unset($copy);
xdebug_debug_zval('name'); // (refcount=1, is_ref=0)string 'liugongzi' (length=9)
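The counting rules above can also be modelled without Xdebug. The following is only a toy sketch: ToyZval and ToySymbolTable are hypothetical names invented for this illustration, and the real zval is a C structure inside the engine, not a PHP class.

```php
<?php
// Toy model of PHP 5 style reference counting.
// ToyZval and ToySymbolTable are hypothetical illustration names.
final class ToyZval
{
    public $value;
    public $refcount = 0;
    public $isRef = false;

    public function __construct($value)
    {
        $this->value = $value;
    }
}

final class ToySymbolTable
{
    /** @var ToyZval[] variable name => container */
    private $symbols = [];

    // $name = <value>: a fresh container with refcount 1.
    public function assignValue(string $name, $value): void
    {
        $this->unsetVar($name);
        $zval = new ToyZval($value);
        $zval->refcount = 1;
        $this->symbols[$name] = $zval;
    }

    // $target = $source: both names share one container, refcount goes up.
    public function assignVar(string $target, string $source): void
    {
        $zval = $this->symbols[$source];
        $zval->refcount++;
        $this->unsetVar($target);
        $this->symbols[$target] = $zval;
    }

    // unset($name): the name is removed and the refcount goes down.
    public function unsetVar(string $name): void
    {
        if (isset($this->symbols[$name])) {
            $this->symbols[$name]->refcount--;
            unset($this->symbols[$name]);
        }
    }

    public function refcount(string $name): int
    {
        return $this->symbols[$name]->refcount;
    }
}

$table = new ToySymbolTable();
$table->assignValue('name', 'liugongzi');
$afterAssign = $table->refcount('name');

$table->assignVar('copy', 'name');
$afterCopy = $table->refcount('name');

$table->unsetVar('copy');
$afterUnset = $table->refcount('name');

printf("%d, %d, %d\n", $afterAssign, $afterCopy, $afterUnset);
```

The three recorded counts follow the same pattern as the xdebug_debug_zval() output in the example: one after assignment, two after the copy, and one again after unset.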
Copy on write
Copy on write (COW), simply put: when a value is assigned to a variable through assignment, no new memory is allocated to store the value held by the new variable. Instead, the memory is shared, and a counter records how many variables point to it. Only when the value of one of those variables changes is new space allocated to hold the new content, which reduces memory usage. - TPIP copy-on-write
From the zval information of simple variables shown earlier, we know that $copy and $name share one zval variable container (memory), and refcount indicates how many variables currently use that zval.
Look at an example:
<?php
$name = 'liugongzi';
xdebug_debug_zval('name'); // name: (refcount=1, is_ref=0)string 'liugongzi' (length=9)

$copy = $name;
xdebug_debug_zval('name'); // name: (refcount=2, is_ref=0)string 'liugongzi' (length=9)

// Assign a new value to the variable $copy
$copy = 'liugongzi handsome';
xdebug_debug_zval('name'); // name: (refcount=1, is_ref=0)string 'liugongzi' (length=9)
xdebug_debug_zval('copy'); // copy: (refcount=1, is_ref=0)='liugongzi handsome'
Notice that once the value liugongzi handsome is assigned to $copy, the refcount of both name and copy becomes 1. The following operations occur in this process:
Separate $copy from the zval (memory) of $name, that is, copy it;
Subtract 1 from the refcount of $name;
Modify the zval of $copy (assign the new value and set its refcount);
This is only a brief introduction to copy-on-write. Interested readers can consult the reference materials given at the end of the article for more in-depth study.
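Copy-on-write can also be observed indirectly through memory usage. The sketch below assumes PHP 7 or later and uses only memory_get_usage(). The function name measureCopyOnWrite is made up for this illustration, and exact byte counts vary by version and platform, so only the relative sizes are meaningful.

```php
<?php
// Observe copy-on-write through memory_get_usage().
// measureCopyOnWrite() is a hypothetical helper for this illustration.
function measureCopyOnWrite(): array
{
    $original = range(1, 100000);

    $before = memory_get_usage();
    $copy = $original;                 // no duplication yet: both names share one array
    $afterAssign = memory_get_usage();

    $copy[] = 0;                       // the first write to $copy forces the separation
    $afterWrite = memory_get_usage();

    return [
        'assign_cost'    => $afterAssign - $before,
        'write_cost'     => $afterWrite - $afterAssign,
        'original_count' => count($original),
        'copy_count'     => count($copy),
    ];
}

$cow = measureCopyOnWrite();
printf(
    "assignment: %d bytes, first write: %d bytes\n",
    $cow['assign_cost'],
    $cow['write_cost']
);
```

The assignment itself should cost next to nothing, while the first write to the copy allocates room for a whole second array. The original array keeps its 100000 elements.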
View the zval information of a variable assigned by reference
The reference counting rules for assignment by reference (&) are the same as for ordinary assignment statements, except that is_ref is 1, indicating that the variable is bound by reference.
Let's now look at an example of assignment by reference:
<?php
$age = 'liugongzi';
xdebug_debug_zval('age'); // (refcount=1, is_ref=0)string 'liugongzi' (length=9)

$copy = &$age;
xdebug_debug_zval('age'); // (refcount=2, is_ref=1)string 'liugongzi' (length=9)

unset($copy);
xdebug_debug_zval('age'); // (refcount=1, is_ref=1)string 'liugongzi' (length=9)
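The practical difference between a reference and an ordinary copy shows up as soon as one side is written to. This short sketch uses plain PHP only; the variable names are chosen for the illustration.

```php
<?php
$name  = 'liugongzi';
$alias = &$name;   // $alias and $name are bound to the same value
$copy  = $name;    // ordinary assignment: an independent value (copy-on-write)

$alias = 'changed through alias';   // writing through the reference changes $name too

$nameAfterWrite = $name;
$copyAfterWrite = $copy;

unset($alias);     // removes only the name $alias; the value stays reachable via $name
$nameAfterUnset = $name;

var_dump($nameAfterWrite, $copyAfterWrite, $nameAfterUnset);
```

Writing through $alias changes $name but leaves $copy untouched, and unset($alias) only drops one of the names bound to the value.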
Reference counting of composite types
Compared with scalar types (integer, floating point, Boolean, etc.), the reference counting rules for composite types such as arrays and objects are slightly more complicated.
For a better explanation, let's first look at a reference counting example with an array:
$a = array( 'meaning' => 'life', 'number' => 42 );
xdebug_debug_zval( 'a' );
// a:
// (refcount=1, is_ref=0)
// array (size=2)
//   'meaning' => (refcount=1, is_ref=0)string 'life' (length=4)
//   'number' => (refcount=1, is_ref=0)int 42
The reference counts in this example can be pictured as follows:
The diagram shows that the reference counting rules for composite types are basically the same as those for scalars. For the example given, PHP creates 3 zval variable containers: one stores the array itself, and the other two store the elements of the array.
When an existing element is added to the array again, its refcount increases by 1.
$a = array( 'meaning' => 'life', 'number' => 42 );
xdebug_debug_zval( 'a' );
$a['life'] = $a['meaning'];
xdebug_debug_zval( 'a' );
// a:
// (refcount=1, is_ref=0)
// array (size=3)
//   'meaning' => (refcount=2, is_ref=0)string 'life' (length=4)
//   'number' => (refcount=1, is_ref=0)int 42
//   'life' => (refcount=2, is_ref=0)string 'life' (length=4)
The rough diagram is as follows:
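Reference counting alone cannot deal with a value that references itself, as the following example shows:
<?php
// @link http://php.net/manual/zh/function.memory-get-usage.php#96280
function convert($size)
{
    $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb');
    return @round($size / pow(1024, ($i = floor(log($size, 1024)))), 2) . ' ' . $unit[$i];
}

// Note: the interesting part starts here
$memory = memory_get_usage();
$a = array( 'one' );
// Reference itself (circular reference)
$a[] =& $a;
xdebug_debug_zval( 'a' );
var_dump(convert(memory_get_usage() - $memory)); // 296 b

// Remove the variable $a. Because an element of $a references $a itself
// (a circular reference), the memory used by $a cannot be reclaimed.
unset($a);
var_dump(convert(memory_get_usage() - $memory)); // 568 b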
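Judging from the memory usage, although we called unset($a) to destroy the array $a, the memory was not reclaimed. The whole process is illustrated below:
As you can see, no symbol (variable) in the symbol table points to this block of memory any more, so PHP cannot reclaim it. The official explanation is as follows:
Although there is no longer a symbol in any scope pointing to this structure (the variable container), it cannot be cleaned up because the array element "1" still points to this same array. Because no other symbol points to it, the user has no way to clean up this structure, and the result is a memory leak. Fortunately, PHP will clean up this data structure at the end of script execution, but until then it occupies a considerable amount of memory. This situation happens often if you are implementing parsing algorithms, or other things where a child element points back at its parent. The same can of course happen with objects, where it is actually more likely to occur, because objects are always implicitly used by reference.
Simply put, the reference counting algorithm cannot detect and release the memory used by circular references, which ultimately leads to memory leaks.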
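The leak, and its later cleanup by the cycle collector, can be reproduced with objects. This is a minimal sketch assuming PHP 7 or later. makeOrphanedCycles is a hypothetical helper name, and gc_disable() is used only to imitate an engine that relies on reference counting alone.

```php
<?php
// Build $count pairs of objects that point at each other, then let every
// variable that referred to them go out of scope.
// makeOrphanedCycles() is a hypothetical helper for this illustration.
function makeOrphanedCycles(int $count): void
{
    for ($i = 0; $i < $count; $i++) {
        $parent = new stdClass();
        $child = new stdClass();
        $parent->child = $child;
        $child->parent = $parent;                   // child points back at its parent: a cycle
        $parent->payload = str_repeat('x', 1024);   // make each cycle hold some memory
    }
}

gc_disable();                           // no cycle collection while the cycles are created
$baseline = memory_get_usage();
makeOrphanedCycles(1000);
$leaked = memory_get_usage() - $baseline;   // unreachable, yet still allocated

gc_enable();
$collected = gc_collect_cycles();       // run the cycle collector by hand
$remaining = memory_get_usage() - $baseline;

printf("leaked: %d bytes, collected: %d, remaining: %d bytes\n", $leaked, $collected, $remaining);
```

After the function returns, no variable refers to any of the objects, yet the memory stays in use until gc_collect_cycles() runs. The exact numbers depend on the PHP version.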
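Concurrent cycle collection in reference-counted systems
Because the reference counting algorithm cannot reclaim circular references and therefore leaks memory, the implementation of memory reclamation was improved from PHP 5.3 on by adopting the "Concurrent Cycle Collection in Reference Counted Systems" algorithm. This algorithm is an improved version of reference counting that adds the following on top of it:
The concept of a possible root: from what we learned about reference counting, we know that if a variable (zval) is referenced, it is referenced either by a symbol in the global symbol table (that is, a variable) or by a symbol inside the zval of a complex type such as an array (an array element). Such a zval variable container is a "possible root".
The concept of a root buffer: the root buffer stores all possible roots. It has a fixed size and holds 10000 possible roots by default. To change this, modify the constant GC_ROOT_BUFFER_MAX_ENTRIES in the PHP source file Zend/zend_gc.c and recompile.
The collection cycle: when the buffer is full, garbage collection is performed on all possible roots in the buffer.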
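The automatic trigger can be watched from userland. This sketch assumes PHP 7.3 or later, where gc_status() is available, and it assumes the default threshold of roughly 10000 roots described above. Treat it as a rough illustration rather than a description of the engine's internals.

```php
<?php
// Assumes PHP 7.3+ (gc_status()) and the default root threshold.
gc_enable();
$runsBefore = gc_status()['runs'];

// Each iteration orphans the two-object cycle built in the previous one,
// which adds possible roots to the root buffer.
for ($i = 0; $i < 20000; $i++) {
    $a = new stdClass();
    $b = new stdClass();
    $a->peer = $b;
    $b->peer = $a;
}
unset($a, $b);

$status = gc_status();
$automaticRuns = $status['runs'] - $runsBefore;

printf(
    "automatic GC runs: %d, roots currently buffered: %d\n",
    $automaticRuns,
    $status['roots']
);
```

No call to gc_collect_cycles() appears in the script, yet the run counter goes up once enough possible roots have accumulated.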
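The figure below (from the PHP manual) shows how the new collection algorithm runs:
The concurrent cycle collection process in reference-counted systems
The buffer (the purple box, called suspected garbage) stores all possible roots (step A);
Traverse all possible roots (zval containers) in the root buffer depth-first and decrease the refcount of every zval reached by 1. To avoid decreasing the same zval more than once (different roots may reach the same zval), each zval is marked as "decremented" (step B);
Traverse the possible-root zvals depth-first again. If a zval's refcount is not 0, add 1 back to it; otherwise leave it at 0. Mark the zval containers visited as "restored" (the inverse of step B). The zvals whose refcount is 0 (marked with blue boxes) are the variables that should be collected (step C);
Delete all possible roots whose refcount is 0 (step D).
The whole process is:
Using depth-first traversal: simulated deletion > simulated restoration > actual deletion, which reclaims the memory.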
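Steps A to D can be imitated on a toy graph. The sketch below is only an illustration under simplifying assumptions, not the Zend engine's implementation. Nodes are plain array entries. The function names (markGrey, scan, scanBlack, collectCycles) and the colour labels are made up here, with grey standing for "decremented", black for "restored" and white for "garbage".

```php
<?php
// Toy cycle collector over a graph stored as
// name => ['refcount' => int, 'children' => string[], 'color' => string].
// All function names are hypothetical illustration names.

// Step B: simulated deletion. Decrement the refcount of every child once.
function markGrey(array &$nodes, string $id): void
{
    if ($nodes[$id]['color'] === 'grey') {
        return;                          // already decremented via another path
    }
    $nodes[$id]['color'] = 'grey';
    foreach ($nodes[$id]['children'] as $child) {
        $nodes[$child]['refcount']--;
        markGrey($nodes, $child);
    }
}

// Undo step B for a node that turned out to be externally referenced.
function scanBlack(array &$nodes, string $id): void
{
    $nodes[$id]['color'] = 'black';
    foreach ($nodes[$id]['children'] as $child) {
        $nodes[$child]['refcount']++;
        if ($nodes[$child]['color'] !== 'black') {
            scanBlack($nodes, $child);
        }
    }
}

// Step C: simulated restoration. A refcount of 0 means garbage (white).
function scan(array &$nodes, string $id): void
{
    if ($nodes[$id]['color'] !== 'grey') {
        return;
    }
    if ($nodes[$id]['refcount'] > 0) {
        scanBlack($nodes, $id);
        return;
    }
    $nodes[$id]['color'] = 'white';
    foreach ($nodes[$id]['children'] as $child) {
        scan($nodes, $child);
    }
}

// Steps A to D: $roots plays the role of the root buffer.
function collectCycles(array $nodes, array $roots): array
{
    foreach ($roots as $root) {
        markGrey($nodes, $root);
    }
    foreach ($roots as $root) {
        scan($nodes, $root);
    }
    $garbage = array_keys(array_filter($nodes, function ($node) {
        return $node['color'] === 'white';
    }));
    return [$garbage, $nodes];
}

// a <-> b is an orphaned cycle; c <-> d is a cycle still held by one variable.
$graph = [
    'a' => ['refcount' => 1, 'children' => ['b'], 'color' => 'black'],
    'b' => ['refcount' => 1, 'children' => ['a'], 'color' => 'black'],
    'c' => ['refcount' => 2, 'children' => ['d'], 'color' => 'black'],
    'd' => ['refcount' => 1, 'children' => ['c'], 'color' => 'black'],
];

[$garbage, $after] = collectCycles($graph, ['a', 'c']);
print_r($garbage);
```

Only the orphaned pair a and b ends up as garbage. The pair c and d is still held by a variable, so its refcounts are restored to their original values.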
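Advantages of the improved reference counting algorithm
Memory leaks are kept below a threshold. This is enforced by the root buffer: a new round of garbage collection runs whenever the buffer fills up;
Garbage collection performs better: instead of running a collection every time a refcount is decreased by 1, collection starts only when the root buffer is full.
You can learn more from the Collecting Cycles section of the PHP manual, or read the reference materials given at the end of the article.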
Memory management in PHP 7
The main problems with the zval implementation in PHP 5:
A zval is always allocated separately from the heap;
A zval always stores reference-counting and cycle-collection information, even for values such as integers, bool, or null that may not need it;
When objects or resources are used, direct reference counting leads to double counting;
Some accesses involve too many indirections. For example, accessing an object stored in a variable requires dereferencing four pointers (the pointer chain has a length of four);
Direct counting also means that values can only be shared between zvals. Sharing a string between a zval and a hashtable key does not work (unless the hashtable key is also a zval).
Adjustments to the zval data structure implementation in PHP 7:
The most fundamental change is that the memory a zval needs is no longer allocated separately from the heap, and the zval itself no longer stores the reference count.
The reference count of complex data types (such as strings, arrays, and objects) is stored in the value itself.
Advantages of this implementation:
Simple data types do not need separately allocated memory and do not need to be counted;
There is no more double counting. For an object, only the count stored in the object itself is valid;
Since the count is now stored by the value itself (in PHP 5 it was stored by the zval variable container), values can also be shared with non-zval structures, for example between a zval and a hashtable key;
The number of pointers required for indirect access is reduced.
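One visible effect of simple values no longer needing their own heap allocation is the small per-element cost of an integer array. The sketch below is a rough measurement, not a benchmark. bytesPerIntegerElement is a hypothetical helper, and the result depends on the PHP version and platform; on 64-bit PHP 7 and later builds it typically comes out at a few tens of bytes per element.

```php
<?php
// Rough per-element memory cost of an array of integers.
// bytesPerIntegerElement() is a hypothetical helper for this illustration.
function bytesPerIntegerElement(int $count): float
{
    $before = memory_get_usage();

    $numbers = [];
    for ($i = 0; $i < $count; $i++) {
        $numbers[] = $i;        // integers live directly in the array's slots
    }

    $after = memory_get_usage();

    return ($after - $before) / count($numbers);
}

$perElement = bytesPerIntegerElement(100000);
printf("about %.1f bytes per integer element\n", $perElement);
```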