Parsing PHP8 underlying kernel source code - array (2)
In PHP, a zend_array takes one of two forms:
1. packed array
2. hash array
In the previous article I filled in comments for every field of zend_array.
The order of the fields in the source code is actually slightly different from the one I used. I think my order is easier to understand.
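// Code as it appears in the source
typedef struct _zend_array HashTable;
struct _zend_array {
    zend_refcounted_h gc;
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar flags,
                zend_uchar _unused,
                zend_uchar nIteratorsCount,
                zend_uchar _unused2)
        } v;
        uint32_t flags;
    } u;
    uint32_t    nTableMask;
    Bucket     *arData;
    uint32_t    nNumUsed;
    uint32_t    nNumOfElements;
    uint32_t    nTableSize;
    uint32_t    nInternalPointer;
    zend_long   nNextFreeElement;
    dtor_func_t pDestructor;
};

// The same struct with the fields in my order
struct _zend_array {
    zend_refcounted_h gc; // gc takes 8 bytes and holds the reference count and the type information
    union {
        struct {
            ZEND_ENDIAN_LOHI_4(
                zend_uchar flags,           // 8-bit unsigned char (maximum value 255) that flags the HashTable; PHP 8 defines 6 flag values
                zend_uchar _unused,
                zend_uchar nIteratorsCount, // Iterator count. A foreach statement creates an iterator in the executor globals (EG);
                                            // the iterator holds the HashTable being traversed and the cursor position.
                                            // nIteratorsCount records how many iterators are currently iterating over this HashTable.
                zend_uchar _unused2)
        } v; // This differs from Chen Lei's book, where v also contains u.v.nApplyCount and u.v.consistency
        uint32_t flags;
    } u; // u is a union that takes 4 bytes.
         // It can hold either a uint32_t flags or the struct v made of four unsigned chars.
         // The ZEND_ENDIAN_LOHI_4 macro handles big-endian versus little-endian platforms and can be ignored here.
    Bucket *arData;  // Pointer to the units that store the HashTable's data:
                     // the containers that hold each key, its value and auxiliary information.
    uint32_t nTableSize; // Size of the HashTable: the size of the bucket array that arData points to, i.e. the total number of buckets.
                         // Always a power of two (2^n); the minimum is 8 and the maximum on 64-bit systems is 0x80000000 (2^31).
    uint32_t nNumUsed;       // Number of used buckets, counting both valid and invalid buckets
    uint32_t nNumOfElements; // Number of valid buckets; always less than or equal to nNumUsed
    uint32_t nTableMask;     // Index mask. It is -2 for a packed array; for a hash array it is computed from nTableSize by HT_SIZE_TO_MASK.
    uint32_t nInternalPointer; // Default global cursor, used by reset/key/current/next/prev and the related macros and operations
    zend_long nNextFreeElement; // Integer key that the next appended element will receive,
                                // e.g. after $a[] = 1, nNextFreeElement is 1
    dtor_func_t pDestructor; // Points to a function: typedef void (*dtor_func_t)(zval *pDest);
                             // pDest is a pointer to the zval being destroyed.
                             // When a bucket is updated or deleted, this function is called on the bucket's value;
                             // if the value is a refcounted type, its refcount is decremented, which may in turn trigger GC.
};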
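To make the field sizes in the comments concrete, here is a minimal sketch that mirrors the layout with stand-in types. The `mirror_*` names are hypothetical and are not part of php-src; the sketch assumes a typical 64-bit little-endian build, where pointers and zend_long are 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for zend_refcounted_h: a 32-bit refcount plus 32 bits of type info. */
typedef struct {
    uint32_t refcount;
    uint32_t type_info;
} mirror_refcounted_h;

/* Stand-in for struct _zend_array, fields in the php-src order.
 * The byte order inside v assumes a little-endian platform. */
typedef struct {
    mirror_refcounted_h gc;
    union {
        struct {
            uint8_t flags;
            uint8_t _unused;
            uint8_t nIteratorsCount;
            uint8_t _unused2;
        } v;
        uint32_t flags;
    } u;
    uint32_t nTableMask;
    void    *arData;               /* Bucket * in php-src */
    uint32_t nNumUsed;
    uint32_t nNumOfElements;
    uint32_t nTableSize;
    uint32_t nInternalPointer;
    int64_t  nNextFreeElement;     /* zend_long on 64-bit builds */
    void   (*pDestructor)(void *); /* dtor_func_t takes a zval * in php-src */
} mirror_zend_array;

/* Sanity check: the four-byte struct v overlaps the 32-bit flags word exactly. */
void mirror_check_union(void) {
    assert(sizeof(((mirror_zend_array *)0)->u.v) == sizeof(uint32_t));
}

size_t mirror_union_size(void)    { return sizeof(((mirror_zend_array *)0)->u); }
size_t mirror_array_size(void)    { return sizeof(mirror_zend_array); }
size_t mirror_ardata_offset(void) { return offsetof(mirror_zend_array, arData); }
```

Under those assumptions `mirror_union_size()` is 4 and `mirror_array_size()` is 56, which matches the 8 + 4 + 4 + 8 + 16 + 8 + 8 bytes you get by adding up the fields.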
The Understand tool generates a member-variable diagram for zend_array, shown first as generated and then with every member expanded (figure: zend_array structure members).
As the diagram shows, the core types are zval, zend_string, zend_refcounted_h and Bucket, nested layer upon layer.
The Bucket holds the essential data of the array:
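typedef struct _Bucket {
    zval         val;  // the element's value (recall that a zval is only 16 bytes)
    zend_ulong   h;    // the h value of the key: its hash, or the integer key itself
    zend_string *key;  // the string key; used when the array is a hash array
} Bucket;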
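As a rough check on the sizes involved, the sketch below mirrors Bucket with stand-in types. The `mirror_*` names are hypothetical, and the layout assumes a 64-bit build where zend_ulong and pointers are 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for zval: an 8-byte value plus two 4-byte auxiliary words. */
typedef struct {
    union {
        int64_t lval;
        double  dval;
        void   *ptr;
    } value;
    uint32_t type_info;  /* u1 in php-src */
    uint32_t extra;      /* u2 in php-src */
} mirror_zval;

/* Stand-in for Bucket, fields in the same order as the struct above. */
typedef struct {
    mirror_zval val;
    uint64_t    h;    /* zend_ulong on 64-bit builds */
    void       *key;  /* zend_string * in php-src */
} mirror_bucket;

size_t mirror_zval_size(void)   { return sizeof(mirror_zval); }
size_t mirror_bucket_size(void) { return sizeof(mirror_bucket); }

/* Bytes needed for a bucket array with nTableSize entries. */
size_t mirror_bucket_array_bytes(uint32_t nTableSize) {
    /* nTableSize is always a power of two. */
    assert(nTableSize != 0 && (nTableSize & (nTableSize - 1)) == 0);
    return (size_t)nTableSize * sizeof(mirror_bucket);
}
```

With these assumptions a bucket is 32 bytes (16 for the zval, 8 for h, 8 for the key pointer), so the smallest bucket array of 8 entries takes 256 bytes.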
Whether the array is a packed array or a hash array, its elements end up stored in Buckets.
When all keys are integers and the keys increase in insertion order, the array is a packed array.
Characteristics of a packed array
The third and fourth characteristics can be understood as follows: if you do not write keys for a PHP array, the default keys are assigned in order starting from 0.
$a = array(1, 2, 3); // packed array
$b = array(1 => 'a', 3 => 'b', 5 => 'c'); // packed array
There is an index array in front of the bucket array. For a packed array the size of the index array is always 2, because the index is not used.
The contents of the zend_array corresponding to $a are:
nTableSize: the size of the bucket array pointed to by arData, that is, the total number of buckets. It equals the total size of the array.
nNumUsed: the number of used buckets, counting both valid and invalid buckets.
nNumOfElements: the number of valid buckets.
So nNumOfElements <= nNumUsed <= nTableSize.
nTableMask: the index mask. Because a packed array does not use the index, it is always -2.
nNextFreeElement: the integer key that the next appended element will receive.
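The bookkeeping for $a and $b can be sketched as follows. This is a simplified illustration and not the php-src implementation: the `sketch_*` names are hypothetical, values are plain integers, the table has a fixed size of 8, and growing or converting to a hash array is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 8u  /* the minimum nTableSize mentioned above */

typedef struct {
    bool    is_undef;  /* true marks an invalid bucket (a hole) */
    int64_t value;
} sketch_slot;

typedef struct {
    sketch_slot arData[SKETCH_TABLE_SIZE];
    uint32_t    nTableSize;
    uint32_t    nNumUsed;
    uint32_t    nNumOfElements;
    int64_t     nNextFreeElement;
} sketch_packed;

void sketch_packed_init(sketch_packed *ht) {
    ht->nTableSize = SKETCH_TABLE_SIZE;
    ht->nNumUsed = 0;
    ht->nNumOfElements = 0;
    ht->nNextFreeElement = 0;
}

/* Store value under integer key h. The sketch only accepts increasing keys
 * that fit in the fixed table; anything else is rejected. */
bool sketch_packed_add(sketch_packed *ht, uint32_t h, int64_t value) {
    if (h < ht->nNumUsed || h >= ht->nTableSize) {
        return false;
    }
    /* Skipped positions become invalid buckets: used, but not elements. */
    for (uint32_t i = ht->nNumUsed; i < h; i++) {
        ht->arData[i].is_undef = true;
        ht->arData[i].value = 0;
    }
    ht->arData[h].is_undef = false;
    ht->arData[h].value = value;
    ht->nNumUsed = h + 1;
    ht->nNumOfElements++;
    ht->nNextFreeElement = (int64_t)h + 1;
    assert(ht->nNumOfElements <= ht->nNumUsed && ht->nNumUsed <= ht->nTableSize);
    return true;
}

/* Direct lookup: the key is the position in arData, no index array involved. */
const int64_t *sketch_packed_find(const sketch_packed *ht, uint32_t h) {
    if (h >= ht->nNumUsed || ht->arData[h].is_undef) {
        return NULL;
    }
    return &ht->arData[h].value;
}
```

Adding keys 0, 1, 2 (as in $a) leaves nNumUsed and nNumOfElements both at 3. Adding keys 1, 3, 5 (as in $b) leaves nNumUsed at 6 but nNumOfElements at 3, because positions 0, 2 and 4 are invalid buckets.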
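A packed array takes advantage of the contiguity of the bucket array; it is an optimization for arrays that have only integer keys. Because the index array shrinks to its minimum of 2 slots, the memory for the remaining slots is saved: (nTableSize - 2) * sizeof(uint32_t) bytes when the index has one slot per bucket, or (2 * nTableSize - 2) * sizeof(uint32_t) bytes with the doubled index that HT_SIZE_TO_MASK produces in PHP 7.3 and later. In addition, because elements are accessed by indexing the bucket array directly, performance improves as well.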
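The memory saving described above can be worked through with a small sketch. The `sketch_*` helpers are hypothetical and only illustrate the arithmetic. The number of index slots per bucket is a parameter: 1 reproduces the (nTableSize - 2) * sizeof(uint32_t) figure, while 2 corresponds to the doubled index that, as far as I recall, PHP 7.3 and later use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mask for an index array with `slots` entries: the two's complement of slots. */
uint32_t sketch_mask_for_slots(uint32_t slots) {
    /* The index size is a power of two and never smaller than 2. */
    assert(slots >= 2 && (slots & (slots - 1)) == 0);
    return 0u - slots;
}

/* Bytes taken by the index array that sits in front of the bucket array. */
size_t sketch_index_bytes(uint32_t nTableMask) {
    return (size_t)(0u - nTableMask) * sizeof(uint32_t);
}

/* Bytes a packed array saves compared with a hash array of the same nTableSize. */
size_t sketch_bytes_saved(uint32_t nTableSize, uint32_t slots_per_bucket) {
    uint32_t hash_mask   = sketch_mask_for_slots(nTableSize * slots_per_bucket);
    uint32_t packed_mask = sketch_mask_for_slots(2);  /* packed arrays always use -2 */
    return sketch_index_bytes(hash_mask) - sketch_index_bytes(packed_mask);
}
```

For nTableSize = 8 this gives 24 bytes with one slot per bucket and 56 bytes with two, against a packed index of only 8 bytes.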
If the conditions for a packed array are not met, PHP represents the array as a hash array.
Any array whose keys are not all integers is represented as a hash array.
$c =array('x'=>1,'y'=>2,'z'=>3,'a'=>0);
The $c above is represented as a hash array (figure: the buckets of $c).
The zend_array of $c is as follows:
nTableSize: the size of the bucket array pointed to by arData, that is, the total number of buckets. = 8
nNumUsed: the number of used buckets, counting both valid and invalid buckets. = 4
nNumOfElements: the number of valid buckets. = 4
So nNumOfElements <= nNumUsed <= nTableSize.
nTableMask: the index mask. = -16, since HT_SIZE_TO_MASK in PHP 7.3 and later computes -(2 * nTableSize); under the older -nTableSize rule it would be -8.
nNextFreeElement: the integer key that the next appended element will receive. In a hash array it stays at 0 as long as it is not used.
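How a hash array uses nTableMask to find a bucket can be sketched as below. This is a simplified model and not the php-src code: the `sk_*` names are hypothetical, the hash function is a plain "times 33" loop, values are integers, duplicates are not checked and the table never grows. It assumes two index slots per bucket (mask -16 for nTableSize 8); with one slot per bucket `SK_HASH_SLOTS` would be 8 and the mask -8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_TABLE_SIZE  8u
#define SK_HASH_SLOTS  16u         /* assumption: two index slots per bucket */
#define SK_INVALID_IDX UINT32_MAX  /* marks an empty slot or the end of a chain */

typedef struct {
    int64_t     val;
    uint64_t    h;     /* hash of the key */
    const char *key;
    uint32_t    next;  /* next bucket in the same collision chain */
} sk_bucket;

typedef struct {
    uint32_t  hash[SK_HASH_SLOTS];    /* index array; in PHP it sits just before arData */
    sk_bucket arData[SK_TABLE_SIZE];
    uint32_t  nTableMask;
    uint32_t  nTableSize;
    uint32_t  nNumUsed;
    uint32_t  nNumOfElements;
} sk_hash;

static uint64_t sk_hash_func(const char *key) {
    uint64_t h = 5381;
    while (*key) {
        h = h * 33 + (unsigned char)*key++;
    }
    return h;
}

/* h | nTableMask is a small negative number when read as signed; subtracting
 * the mask turns it into an offset from the start of the index array. */
static uint32_t sk_slot(const sk_hash *ht, uint64_t h) {
    uint32_t nIndex = (uint32_t)h | ht->nTableMask;
    return nIndex - ht->nTableMask;
}

void sk_hash_init(sk_hash *ht) {
    for (uint32_t i = 0; i < SK_HASH_SLOTS; i++) {
        ht->hash[i] = SK_INVALID_IDX;
    }
    ht->nTableMask = 0u - SK_HASH_SLOTS;
    ht->nTableSize = SK_TABLE_SIZE;
    ht->nNumUsed = 0;
    ht->nNumOfElements = 0;
}

/* Append a bucket in insertion order and link it at the head of its slot's chain. */
bool sk_hash_add(sk_hash *ht, const char *key, int64_t val) {
    if (ht->nNumUsed >= ht->nTableSize) {
        return false;  /* the sketch does not resize */
    }
    uint64_t h = sk_hash_func(key);
    uint32_t slot = sk_slot(ht, h);
    uint32_t idx = ht->nNumUsed++;
    ht->arData[idx].val = val;
    ht->arData[idx].h = h;
    ht->arData[idx].key = key;
    ht->arData[idx].next = ht->hash[slot];
    ht->hash[slot] = idx;
    ht->nNumOfElements++;
    assert(ht->nNumOfElements <= ht->nNumUsed && ht->nNumUsed <= ht->nTableSize);
    return true;
}

/* Follow the chain for the key's slot until hash and key both match. */
const int64_t *sk_hash_find(const sk_hash *ht, const char *key) {
    uint64_t h = sk_hash_func(key);
    uint32_t idx = ht->hash[sk_slot(ht, h)];
    while (idx != SK_INVALID_IDX) {
        const sk_bucket *b = &ht->arData[idx];
        if (b->h == h && strcmp(b->key, key) == 0) {
            return &b->val;
        }
        idx = b->next;
    }
    return NULL;
}
```

Inserting 'x', 'y', 'z' and 'a' as in $c fills the first four buckets in insertion order, so nNumUsed and nNumOfElements are both 4 while nTableSize stays at 8.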
This article was published on the PHP Chinese website with the consent of the original author, PHP Cui Xuefeng. Original address: https://zhuanlan.zhihu.com/p/358354087