Home  >  Article  >  Backend Development  >  PHP array implementation principle

PHP array implementation principle

angryTom
angryTomOriginal
2019-08-22 15:36:544665browse

PHP array implementation principle

Arrays are the most commonly used data type in PHP. At the same time, PHP is easy to use thanks to its powerful arrays, but how are arrays implemented in PHP?

Recommended tutorial: PHP video tutorial

First of all, let’s understand the relevant data structure first, as follows Lay a good foundation for the content

Hash table

Hash table, as the name suggests, is a kind of mapping different keywords to different units data structure. The method of mapping different keywords to different units is called hash function

Ideally, after hash function processing, keywords and units will correspond one-to-one; but if the keyword value is enough In many cases, it is easy for multiple keywords to be mapped to the same unit, that is, hash conflicts.

The solution to hash conflicts is to use either the chaining method or the open addressing method

Linking method
That is, when different keywords are mapped to the same unit, a linked list is used to save these keywords in the same unit

Open addressing Method
That is, when inserting data, if it is found that there is data in the unit to which the keyword is mapped, it means that a conflict has occurred, and then continue to search for the next unit until an available unit is found

And Because the open addressing method occupies the location of other keyword mapping units, subsequent keywords are more likely to have hash conflicts, and therefore are prone to performance degradation

Linked list

Since we mentioned linked lists above, here we briefly talk about the basics of linked lists. There are many types of linked lists. Commonly used data structures include: queue, stack, two-way linked list, etc.

A linked list is a data structure composed of different linked list nodes. Linked list nodes generally consist of elements pointing to pointers to the next node. The doubly linked list, as the name suggests, is composed of a pointer element pointing to the previous node and a pointer pointing to the next node. As for the content of the data structure, we will not expand too much. We will have special content to introduce it in detail later. Data structure

php array The way PHP resolves hash conflicts is to use the linking method, so the PHP array is a linked list of hash tables The implementation, to be precise, is implemented by a hash table doubly linked list.

Internal structure - Hash table

The HashTable structure is mainly used to store the basic information of the hash table

typedef struct _hashtable { 
    uint nTableSize;        // hash Bucket的大小,即哈希表的容量,最小为8,以2x增长。
    uint nTableMask;        // nTableSize-1 , 索引取值的优化
    uint nNumOfElements;    // hash Bucket中当前存在的元素个数,count()函数会直接返回此值 
    ulong nNextFreeElement; // 下一个可使用的数字键值
    Bucket *pInternalPointer;   // 当前遍历的指针(foreach比for快的原因之一)
    Bucket *pListHead;          // 存储整个哈希表的头元素指针
    Bucket *pListTail;          // 存储整个哈希表的尾元素指针
    Bucket **arBuckets;         // 存储hash数组
    dtor_func_t pDestructor;    // 在删除元素时执行的回调函数,用于资源的释放
    zend_bool persistent;       //指出了Bucket内存分配的方式。如果persisient为TRUE,则使用操作系统本身的内存分配函数为Bucket分配内存,否则使用PHP的内存分配函数。
    unsigned char nApplyCount; // 标记当前hash Bucket被递归访问的次数(防止多次递归)
    zend_bool bApplyProtection;// 标记当前hash桶允许不允许多次访问,不允许时,最多只能递归3次
#if ZEND_DEBUG
    int inconsistent;
#endif
} HashTable;

The Bucket structure is used for The specific content of the saved data

typedef struct bucket {
    ulong h;            // 对char *key进行hash后的值,或者是用户指定的数字索引值
    uint nKeyLength;    // hash关键字的长度,如果数组索引为数字,此值为0
    void *pData;        // 指向value,一般是用户数据的副本,如果是指针数据,则指向pDataPtr
    void *pDataPtr;     // 如果是指针数据,此值会指向真正的value,同时上面pData会指向此值
    struct bucket *pListNext;   // 指向整个哈希表的该单元的下一个元素
    struct bucket *pListLast;   // 指向整个哈希表的该单元的上一个元素
    struct bucket *pNext;       // 指向由于哈希冲突导致存放在同一个单元的链表中的下一个元素
    struct bucket *pLast;       // 指向由于哈希冲突导致存放在同一个单元的链表中的上一个元素
    // 保存当前值所对于的key字符串,这个字段只能定义在最后,实现变长结构体
    char arKey[1];              
} Bucket;

The Bucket structure contains the pData element that points to the user data, which actually points to the variable zval structure we introduced before. This is why when creating an array, array element 1 appears. variable container.

Hash table internal structure diagram

PHP array implementation principle From the above figure we can see that when Bucket stores data, if If there is a hash conflict, multiple keywords will be mapped to the linked list, thus forming a doubly linked list

Summary Today, we Using arrays as the entry point, we briefly understood the basic data structures: hash tables and linked lists; and also learned about the underlying implementation of arrays, that is, hash tables doubly linked lists. In fact, the hash table is the most important data structure in PHP and has many uses. Variable symbol tables, function lists, etc. are all stored using hash tables

The above is the detailed content of PHP array implementation principle. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn