PHP7 garbage collection mechanism (GC) analysis

Guanhui
2020-05-20 11:17:57

Garbage collection mechanism

The garbage collection mechanism is a dynamic storage allocation scheme that automatically releases allocated memory blocks the program no longer needs. This process of automatically reclaiming memory is called garbage collection. It frees programmers from worrying too much about memory allocation, so that they can devote more energy to business logic. Among today's popular languages, garbage collection is a common feature of the newer generation.

Generation of garbage

Complex types in PHP7, such as strings, arrays, and objects, carry a gc field in their header, which exists to support garbage collection. When a variable is assigned or passed, the reference count of its value is incremented. When the variable is released by unset, return, and so on, the reference count is decremented. If the refcount is found to be 0 after the decrement, the value is freed immediately. This is the basic recycling process for variables.
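As a rough illustration of the counting described above, the following sketch uses an object with a destructor to make the moment of release visible. The class name Tracked and its static $log array are hypothetical names introduced only for this example.

```php
<?php
// Hypothetical helper class: records when its instance is actually freed.
class Tracked
{
    public static $log = [];

    public function __destruct()
    {
        self::$log[] = 'destroyed';
    }
}

$a = new Tracked();                    // one variable refers to the object
$b = $a;                               // assignment: a second variable refers to the same object
unset($a);                             // the count drops but stays above 0, nothing is freed
Tracked::$log[] = 'after unset($a)';
unset($b);                             // the count reaches 0, the object is freed immediately
Tracked::$log[] = 'after unset($b)';

echo implode("\n", Tracked::$log), "\n";
```

The destructor entry lands between the two markers, which is the "released directly when the refcount becomes 0" behaviour.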

However, there is one problem that this mechanism cannot solve, which is the problem of circular references.

What is a circular reference? Simply put, the value stored inside a variable refers back to the variable itself. This happens most often with variables of array and object types.

Let's talk about references first, that is, the zend_reference type. This is a new type in PHP7. When the "&" operator is applied to a variable, a new intermediate structure, zend_reference, is created, and this structure in turn points to the actual value structure.

For example:

// When the following assignments are performed
$a = 'hello'; // $a -> zend_string
$b = $a; // $b,$a -> zend_string
$c = &$b; // $c,$b -> zval(type = IS_REFERENCE, refcount = 2) -> zend_string

The variables end up laid out as follows:

[Figure: $a points directly to the zend_string, while $b and $c point to it through a shared zend_reference]

That is, the zvals of $b and $c go through the intermediate zend_reference structure, which then points to the final zend_string.
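The effect of that shared intermediate structure can be observed from user code. This is a minimal sketch that reuses the variable names from the example above; it shows only the visible behaviour, not the internal structures.

```php
<?php
$a = 'hello';
$b = $a;        // $a and $b refer to the same value
$c = &$b;       // $b and $c now share one reference
$c = 'world';   // writing through the reference changes $b as well, but not $a

echo $a, ' ', $b, ' ', $c, "\n"; // prints "hello world world"
```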

Back to the issue of circular references, here is an example of array circular references:

$a = [1];
$a[] = &$a;
unset($a);


The details are as follows:

[Figure: $a as a reference whose array contains an element pointing back to the same zend_reference]

After unset, the structure looks like this:

[Figure: the same structure after unset($a), with the zval of $a set to IS_UNDEF]

That is, the type of the zval for $a becomes IS_UNDEF, and the reference count of the zend_reference structure is reduced by 1 but is still greater than 0. At this point this part of the structure is garbage. If it is not handled, it may cause a memory leak. This is where the garbage collector is needed: it collects such values into a buffer and then reclaims them.
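The array example can be checked from user code with the built-in gc_enable() and gc_collect_cycles() functions. This is a sketch that assumes a default configuration with the cycle collector available; the exact number returned may differ between PHP versions, so it is only compared against zero.

```php
<?php
gc_enable();                        // make sure the cycle collector is switched on
gc_collect_cycles();                // start from an empty buffer

$a = [1];
$a[] = &$a;                         // the array now contains a reference to itself
unset($a);                          // the name is gone, but the structure still references itself

$collected = gc_collect_cycles();   // force a run instead of waiting for the buffer to fill
echo 'collected: ', $collected, "\n";
```

A non-zero return value indicates that the collector found and freed the orphaned structure that plain reference counting could not release.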

Recycling process

If the refcount of a variable is still greater than 0 after being decremented, PHP does not immediately run garbage identification and reclamation on it. Instead, it puts the value into a buffer, and the buffer is processed in one pass once it is full (10000 values). What is added to the buffer is the gc of the variable's zend_value. Currently, garbage can only arise in two types: arrays and objects. The array case was introduced above; in the object case, a member property refers to the object itself. For other types, a member of a value can never refer to the value itself, so garbage collection only processes these two types of variables.
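The object case can be sketched in the same way. The class name Node and its static counter are hypothetical, introduced here only to make the deferred release visible; the explicit gc_collect_cycles() call stands in for the run that would otherwise happen when the buffer fills up.

```php
<?php
// Hypothetical class whose destructor counts how many instances were freed.
class Node
{
    public $self;
    public static $destroyed = 0;

    public function __destruct()
    {
        self::$destroyed++;
    }
}

gc_enable();
gc_collect_cycles();           // start from an empty buffer

$node = new Node();
$node->self = $node;           // a member property refers to the object itself
unset($node);                  // refcount goes from 2 to 1: not freed, only buffered

$before = Node::$destroyed;    // nothing has been collected yet
gc_collect_cycles();           // force the buffer to be processed now
$after = Node::$destroyed;     // the cycle has been identified and freed

echo "destroyed before: $before, after: $after\n"; // prints "destroyed before: 0, after: 1"
```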

The gc field is of type zend_refcounted_h, whose structure is as follows:

typedef struct _zend_refcounted_h {
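    uint32_t         refcount; // number of references to the zend_value
    union {
        struct {
            // php-src wraps these three fields in the ZEND_ENDIAN_LOHI_3() macro
            zend_uchar    type;    // type of the zend_value, same as zval.u1.type
            zend_uchar    flags;
            uint16_t      gc_info; // GC info: position in the GC buffer and color, used during collection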
        } v;
        uint32_t type_info;
    } u;
} zend_refcounted_h;

A variable can only be added to the buffer once. To prevent repeated additions, zend_refcounted_h.gc_info is set to GC_PURPLE once the variable has been added; a value marked purple will not be inserted again.
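The duplicate check can be pictured with a small toy model. Everything here (ToyValue, addPossibleRoot, and the COLOR_ constants) is hypothetical and written in PHP for readability; it is not the engine's C implementation, only a sketch of the "insert once, then mark purple" rule.

```php
<?php
// Toy model only: hypothetical names, not the engine's C code.
const COLOR_BLACK  = 'black';    // not in the buffer
const COLOR_PURPLE = 'purple';   // already recorded as a possible root

class ToyValue
{
    public $refcount = 1;
    public $color = COLOR_BLACK;
}

// Record a value as a possible root unless it is already marked purple.
function addPossibleRoot(array &$buffer, ToyValue $value): bool
{
    if ($value->color === COLOR_PURPLE) {
        return false;             // already buffered, skip the duplicate
    }
    $value->color = COLOR_PURPLE;
    $buffer[] = $value;
    return true;
}

$buffer = [];
$value = new ToyValue();
$first = addPossibleRoot($buffer, $value);    // inserted
$second = addPossibleRoot($buffer, $value);   // rejected as a duplicate

echo count($buffer), "\n"; // prints 1
```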

The garbage buffer is a doubly linked list. When the buffer is full, the garbage checking process starts: traverse the buffer, traverse all members of the current variable, and reduce the refcount of each member by 1 (if a member itself contains sub-members, they are traversed recursively, that is, depth-first), and finally check the refcount of the current variable. If it has dropped to 0, the variable is garbage. The core idea of the algorithm is this: garbage is caused by members referring to the variable itself, so subtract the references held by all members. If the refcount of the variable itself ends up at 0, all of its references came from its own members, which means nothing else uses it any more; it is garbage and needs to be reclaimed. Otherwise, it is not garbage and needs to be removed from the buffer. The specific process is as follows:

(1) Start traversing from the roots of the buffer linked list and mark the current value gray (zend_refcounted_h.gc_info is set to GC_GREY). Then perform a depth-first traversal of the members of the current value, reducing the refcount of each member value by 1 and marking it gray as well;

(2) Traverse the buffer linked list again and check whether the refcount of the current value is 0. If it is 0, it is indeed garbage, so mark it white (GC_WHITE). If it is not 0, the possibility that all of its references come from its own members is ruled out, which means there are external references and it is not garbage. In that case, because the refcount of the members was reduced by 1 in step (1), it has to be restored: perform a depth-first traversal of all members, increase each member's refcount by 1, and mark them black;

(3) Traverse the buffer linked list once more and remove the non-GC_WHITE nodes from the roots list. What remains in the roots list is all real garbage, which is finally freed.
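The three steps above can be sketched as a toy simulation. ToyNode and the functions below are hypothetical stand-ins written in PHP, not the engine's C code; the refcounts are plain integers set by hand so that the marking logic is easy to follow.

```php
<?php
// Toy simulation only: hypothetical names mirroring steps (1) to (3).
class ToyNode
{
    public $name;
    public $refcount;
    public $color = 'black';
    public $members = [];

    public function __construct($name, $refcount)
    {
        $this->name = $name;
        $this->refcount = $refcount;
    }
}

// Step (1): mark gray depth-first and subtract the references held by members.
function markGrey(ToyNode $node)
{
    if ($node->color === 'grey') {
        return;
    }
    $node->color = 'grey';
    foreach ($node->members as $member) {
        $member->refcount--;
        markGrey($member);
    }
}

// Step (2), restore path: outside references exist, so undo step (1).
function scanBlack(ToyNode $node)
{
    $node->color = 'black';
    foreach ($node->members as $member) {
        $member->refcount++;
        if ($member->color !== 'black') {
            scanBlack($member);
        }
    }
}

// Step (2): a gray value with refcount 0 is garbage (white), otherwise restore it.
function scan(ToyNode $node)
{
    if ($node->color !== 'grey') {
        return;
    }
    if ($node->refcount > 0) {
        scanBlack($node);
        return;
    }
    $node->color = 'white';
    foreach ($node->members as $member) {
        scan($member);
    }
}

// Step (3): keep only the white roots, which are the real garbage.
function collectGarbage(array $roots)
{
    foreach ($roots as $root) {
        markGrey($root);
    }
    foreach ($roots as $root) {
        scan($root);
    }
    return array_values(array_filter($roots, function (ToyNode $node) {
        return $node->color === 'white';
    }));
}

$orphan = new ToyNode('orphan', 1);   // referenced only by its own member
$orphan->members[] = $orphan;
$inUse = new ToyNode('inUse', 2);     // referenced by its own member and one outside variable
$inUse->members[] = $inUse;

$garbage = collectGarbage([$orphan, $inUse]);
foreach ($garbage as $node) {
    echo $node->name, "\n";           // prints "orphan"
}
```

In this run the self-referencing node with no outside reference ends up white, while the node that is still used elsewhere has its refcount restored and is marked black.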


Statement:
This article is reproduced from learnku.com.