Home  >  Article  >  Backend Development  >  A brief discussion of PHP source code 34: PHP5.3’s newly added garbage collection mechanism (Garbage Collection)

A brief discussion of PHP source code 34: PHP5.3’s newly added garbage collection mechanism (Garbage Collection)

不言
不言Original
2018-06-29 10:07:011027browse

This article mainly introduces about PHP source code 34: PHP5.3’s newly added garbage collection mechanism (Garbage Collection), which has a certain reference value. Now I share it with you. Friends in need can refer to it.

A brief discussion of PHP source code thirty-four: PHP5.3’s new garbage collection mechanism (Garbage Collection)
In the previous article, a brief discussion of PHP source code thirty-three: PHP5.3’s new addition The Basics of Garbage Collection introduces some basic knowledge of the garbage collection mechanism. Today we look at its initialization, adding to the garbage buffer and garbage collection process.
For the official documentation, please click Garbage Collection
Chinese version address: http://docs.php.net/manual/zh/features.gc.php
[Initialization]
In zend/zend_gc .c Line 121 has the function gc_init that implements the initialization of gc. The code is as follows:

 ZEND_API void gc_init(TSRMLS_D){
if (GC_G(buf) == NULL && GC_G(gc_enabled)) {
GC_G(buf) = (gc_root_buffer*) malloc(sizeof(gc_root_buffer) * GC_ROOT_BUFFER_MAX_ENTRIES);
GC_G(last_unused) = &GC_G(buf)[GC_ROOT_BUFFER_MAX_ENTRIES];
gc_reset(TSRMLS_C);
}}

Line 123 determines whether it is empty and whether gc is enabled. If both are true, go to line 124
Line 124 directly calls malloc to allocate 10,000 gc_root_buffer memory.
Line 125 sets the global variable last_unused to the end position of the gc buffer.
Line 126 resets the entire garbage collection mechanism. The code starts from line 88 of zend/zend_gc.c, as follows:

ZEND_API void gc_reset(TSRMLS_D){
GC_G(gc_runs) = 0;
GC_G(collected) = 0; #if GC_BENCH
GC_G(root_buf_length) = 0;
GC_G(root_buf_peak) = 0;
GC_G(zval_possible_root) = 0;
GC_G(zobj_possible_root) = 0;
GC_G(zval_buffered) = 0;
GC_G(zobj_buffered) = 0;
GC_G(zval_remove_from_buffer) = 0;
GC_G(zobj_remove_from_buffer) = 0;
GC_G(zval_marked_grey) = 0;
GC_G(zobj_marked_grey) = 0;#endif 
GC_G(roots).next = &GC_G(roots);
GC_G(roots).prev = &GC_G(roots); if (GC_G(buf)) {
GC_G(unused) = NULL;
GC_G(first_unused) = GC_G(buf); 
GC_G(zval_to_free) = NULL;
} else {
GC_G(unused) = NULL;
GC_G(first_unused) = NULL;
GC_G(last_unused) = NULL;
}}

Lines 90~91 set the statistics of the number of times gc runs (gc_runs) and gc The number of garbage in (collected) is 0.
Lines 106~107 set the previous node and next node of the head node of the doubly linked list to point to itself.

About gc_enabled, it is enabled by default and can be configured in php.ini.
The implementation code is as follows in line 93 of zend/zend.c:

STD_ZEND_INI_BOOLEAN("zend.enable_gc","1",ZEND_INI_ALL,OnUpdateGCEnabled,   gc_enabled, zend_gc_globals,        gc_globals)

The initialization call is in line 79 of zend/zend.c

 static ZEND_INI_MH(OnUpdateGCEnabled) /* {{{ */{
OnUpdateBool(entry, new_value, new_value_length, mh_arg1, mh_arg2, mh_arg3, stage TSRMLS_CC); if (GC_G(gc_enabled)) {
gc_init(TSRMLS_C);
} return SUCCESS;}

[Add to garbage buffer]
Track PHP source code zend/zend_execute_API.c 424 lines
[_zval_ptr_dtor] -> [GC_ZVAL_CHECK_POSSIBLE_ROOT()] -> [gc_zval_check_possible_root()] -> [gc_zval_possible_root()]
Among them, gc_zval_check_possible_root( ) function , only perform garbage collection operations on arrays and objects

The code of the gc_zval_possible_root function is as follows:

ZEND_API void gc_zval_possible_root(zval *zv TSRMLS_DC){
if (UNEXPECTED(GC_G(free_list) != NULL &&
               GC_ZVAL_ADDRESS(zv) != NULL &&
           GC_ZVAL_GET_COLOR(zv) == GC_BLACK) &&
           (GC_ZVAL_ADDRESS(zv) < GC_G(buf) ||
            GC_ZVAL_ADDRESS(zv) >= GC_G(last_unused))) {
/* The given zval is a garbage that is going to be deleted by
 * currently running GC */
return;
} if (zv->type == IS_OBJECT) {
GC_ZOBJ_CHECK_POSSIBLE_ROOT(zv);
return;
} 
GC_BENCH_INC(zval_possible_root); if (GC_ZVAL_GET_COLOR(zv) != GC_PURPLE) {
GC_ZVAL_SET_PURPLE(zv); if (!GC_ZVAL_ADDRESS(zv)) {
gc_root_buffer *newRoot = GC_G(unused); if (newRoot) {
GC_G(unused) = newRoot->prev;
} else if (GC_G(first_unused) != GC_G(last_unused)) {
newRoot = GC_G(first_unused);
GC_G(first_unused)++;
} else {
if (!GC_G(gc_enabled)) {
GC_ZVAL_SET_BLACK(zv);
return;
}
zv->refcount__gc++;
gc_collect_cycles(TSRMLS_C);
zv->refcount__gc--;
newRoot = GC_G(unused);
if (!newRoot) {
return;
}
GC_ZVAL_SET_PURPLE(zv);
GC_G(unused) = newRoot->prev;
} 
newRoot->next = GC_G(roots).next;
newRoot->prev = &GC_G(roots);
GC_G(roots).next->prev = newRoot;
GC_G(roots).next = newRoot; 
GC_ZVAL_SET_ADDRESS(zv, newRoot); 
newRoot->handle = 0;
newRoot->u.pz = zv; 
GC_BENCH_INC(zval_buffered);
GC_BENCH_INC(root_buf_length);
GC_BENCH_PEAK(root_buf_peak, root_buf_length);
}
}}

Lines 132~140 check whether the zval node information has been put into the node buffer, If it has been put into the node buffer, return directly, which can optimize its performance

Lines 142 to 145 process the object node and return directly without performing subsequent operations

Line 149 determines whether the node has been marked purple. If it is purple, it will no longer be added to the node buffer. This is to ensure that a node is only added to the buffer once.

Line 150 marks the color of the node as purple, indicating that this node has been added to the buffer, and there is no need to add it next time

Lines 153~157 find the new node If the buffer is full, a garbage collection operation is performed.

Lines 176~184 add new nodes to the doubly linked list where the buffer is located.

[Garbage collection process]
In the gc_zval_possible_root function, when the buffer is full, the program calls the gc_collect_cycles function to perform garbage collection operations. Starting from line 615 of the zend/zend_gc.c file, the implementation code is as follows:

 ZEND_API int gc_collect_cycles(TSRMLS_D){
int count = 0; if (GC_G(roots).next != &GC_G(roots)) {
zval_gc_info *p, *q, *orig_free_list, *orig_next_to_free; if (GC_G(gc_active)) {
return 0;
}
GC_G(gc_runs)++;
GC_G(zval_to_free) = FREE_LIST_END;
GC_G(gc_active) = 1;
gc_mark_roots(TSRMLS_C);
gc_scan_roots(TSRMLS_C);
gc_collect_roots(TSRMLS_C); 
orig_free_list = GC_G(free_list);
orig_next_to_free = GC_G(next_to_free);
p = GC_G(free_list) = GC_G(zval_to_free);
GC_G(zval_to_free) = NULL;
GC_G(gc_active) = 0; /* First call destructors */
while (p != FREE_LIST_END) {
if (Z_TYPE(p->z) == IS_OBJECT) {
if (EG(objects_store).object_buckets &&
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].valid &&
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.refcount <= 0 &&
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.dtor &&
!EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].destructor_called) { 
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].destructor_called = 1;
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.refcount++;
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.dtor(EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.object, Z_OBJ_HANDLE(p->z) TSRMLS_CC);
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.refcount--;
}
}
count++;
p = p->u.next;
} /* Destroy zvals */
p = GC_G(free_list);
while (p != FREE_LIST_END) {
GC_G(next_to_free) = p->u.next;
if (Z_TYPE(p->z) == IS_OBJECT) {
if (EG(objects_store).object_buckets &&
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].valid &&
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.refcount <= 0) {
EG(objects_store).object_buckets[Z_OBJ_HANDLE(p->z)].bucket.obj.refcount = 1;
Z_TYPE(p->z) = IS_NULL;
zend_objects_store_del_ref_by_handle_ex(Z_OBJ_HANDLE(p->z), Z_OBJ_HT(p->z) TSRMLS_CC);
}
} else if (Z_TYPE(p->z) == IS_ARRAY) {
Z_TYPE(p->z) = IS_NULL;
zend_hash_destroy(Z_ARRVAL(p->z));
FREE_HASHTABLE(Z_ARRVAL(p->z));
} else {
zval_dtor(&p->z);
Z_TYPE(p->z) = IS_NULL;
}
p = GC_G(next_to_free);
} /* Free zvals */
p = GC_G(free_list);
while (p != FREE_LIST_END) {
q = p->u.next;
FREE_ZVAL_EX(&p->z);
p = q;
}
GC_G(collected) += count;
GC_G(free_list) = orig_free_list;
GC_G(next_to_free) = orig_next_to_free;
} return count;}

Line 619 determines whether the buffer is empty. If it is empty, the garbage collection operation will not be performed.
Line 622 determines Whether the garbage collection operation is carried out regularly, if it is in progress, return directly
Lines 625 to 627 add 1 to the number of garbage collection operations, initialize the free list, and set gc_active to 1 to indicate that garbage return is in progress
Line 628 here As step B of the algorithm in its official document, the algorithm uses depth-first search to find all possible roots. After finding it, the reference count in each variable container is decremented by 1″ to ensure that the same variable container is not decremented twice. ″, marked with gray that has been reduced by 1.
Line 629 This is step C of the algorithm. The algorithm once again uses a depth-first search for each root node and checks the reference count of each variable container. If the reference count is 0, the variable container is marked with white. If the number of references is greater than 0, the operation of using depth-first search to decrement the reference count by 1 at this point is resumed (that is, the reference count is increased by 1), and then they are marked with black again.
The last step D of the algorithm in line 630, the algorithm traverses the root buffer to remove the variable container roots (zval roots) from there. At the same time, it checks whether there are variable containers that were white-marked in the previous step. Each white-marked The variable containers are all cleared.
In [gc_collect_cycles() -> gc_collect_roots() -> zval_collect_white() ] we can see that the white-marked nodes will be added to the global variable zval_to_free list. This The list will be used in subsequent operations.
Lines 632~633 store the global variables free_list and next_to_free in corresponding temporary variables, and will be restored to their current state at the end.
Lines 634~635 are initialized The list that needs to be cleared, the zval list to be cleared and the garbage collection operation status is inactive.
Lines 639 to 655 call the destructor for the first time and count the number of variables cleared
Lines 657~678 clear variables
Lines 682~686 release memory
Lines 687~689 process garbage count statistics and restore free_list and next_to_free variables

The above is the entire content of this article, I hope it will be helpful Everyone’s learning will be helpful. For more related content, please pay attention to the PHP Chinese website!

Related recommendations:

A brief discussion of PHP source code thirty-three: The basics of the new garbage collection mechanism (Garbage Collection) in PHP5.3

A brief discussion of PHP source code thirty-two : emalloc/efree layer and heap layer in PHP memory pool

A brief discussion on PHP source code twenty-nine: About the inheritance of interfaces

The above is the detailed content of A brief discussion of PHP source code 34: PHP5.3’s newly added garbage collection mechanism (Garbage Collection). For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn