PHP extension and embedding: Arrays and hash tables in PHP extensions (1)
In PHP, arrays are implemented as hash tables, which store data as key-value pairs. The Zend engine provides a dedicated API for each kind of hash table operation.
Creation
Every hash table is initialized the same way, through zend_hash_init():
int zend_hash_init(HashTable *ht, uint nSize, hash_func_t pHashFunction, dtor_func_t pDestructor, zend_bool persistent)
ht is a pointer to the hash table. It can refer to an existing HashTable variable, or you can allocate memory for a new one. The usual way to do that is:
ALLOC_HASHTABLE(ht), which is equivalent to ht = emalloc(sizeof(HashTable));.
nSize is the number of elements the hash table is expected to hold, so that memory can be reserved in advance. If it is not a power of 2, it is rounded up according to the formula nSize = pow(2, ceil(log(nSize, 2)));. For example, a value of 5 is rounded up to 8. This is presumably done to simplify memory management.
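The rounding rule above can be sketched with integer shifts instead of pow() and log(). This is only an illustration: round_up_pow2 is a hypothetical helper name, not a Zend function, and the engine's actual code may differ.

```c
#include <assert.h>

/* Hypothetical helper (not part of the Zend API): round n up to the
 * next power of two, i.e. the integer form of
 * nSize = pow(2, ceil(log(nSize, 2))). */
static unsigned int round_up_pow2(unsigned int n)
{
    unsigned int size = 1;

    /* Larger inputs would overflow the shift below. */
    assert(n <= 0x80000000u);

    while (size < n) {
        size <<= 1;   /* 1, 2, 4, 8, ... */
    }
    return size;
}
```

With this sketch, a request for 5 slots yields 8, and a request for 50 yields 64.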
pHashFunction is a leftover from earlier versions of the Zend engine and can always be set to NULL in current versions.
pDestructor points to the callback that is invoked whenever an element is removed from the hash table, for example by zend_hash_del(), or by zend_hash_update() when it replaces an existing element. A callback named method_name would have the following form:
void method_name(void *pElement)
where pElement points to the element being removed.
persistent is a flag indicating whether the hash table is persistent. Persistent data is independent of any single request and is not destroyed during RSHUTDOWN. If the flag is set to 1, the memory for ht must be allocated with pemalloc().
For example, symbol_table is initialized at the start of every PHP request with zend_hash_init(&EG(symbol_table), 50, NULL, ZVAL_PTR_DTOR, 0);
Whenever a variable is unset, the corresponding zval* stored in the hash table is passed to zval_ptr_dtor() for destruction.
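To make the destructor mechanism concrete, here is a minimal stand-alone sketch that does not use the Zend API. All names (toy_table, toy_update, toy_del, count_and_free) are made up for illustration. It assumes, as with a table of zval*, that each slot stores one pointer and that the destructor receives the address of that slot.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_SLOTS 8

/* Same shape as the pDestructor callback described above. */
typedef void (*toy_dtor_t)(void *pElement);

typedef struct {
    void *slots[TOY_SLOTS];  /* each slot stores one pointer */
    toy_dtor_t dtor;         /* called before a stored element goes away */
} toy_table;

static int destroyed_count = 0;

/* Example destructor: pElement is the address of the slot, so it is
 * dereferenced once to reach the stored pointer. */
static void count_and_free(void *pElement)
{
    free(*(void **)pElement);
    destroyed_count++;
}

static void toy_init(toy_table *t, toy_dtor_t dtor)
{
    for (int i = 0; i < TOY_SLOTS; i++) {
        t->slots[i] = NULL;
    }
    t->dtor = dtor;
}

/* Remove an element; the destructor runs only if the slot is occupied. */
static void toy_del(toy_table *t, unsigned idx)
{
    assert(idx < TOY_SLOTS);
    if (t->slots[idx] != NULL) {
        if (t->dtor) {
            t->dtor(&t->slots[idx]);
        }
        t->slots[idx] = NULL;
    }
}

/* Store an element; an existing element at idx is destroyed first. */
static void toy_update(toy_table *t, unsigned idx, void *data)
{
    toy_del(t, idx);
    t->slots[idx] = data;
}
```

In this sketch the destructor fires both on an explicit delete and when an update replaces an existing element, which is the behaviour described for pDestructor.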
Population
There are four main functions for inserting and updating data in a hash table:
int zend_hash_add(HashTable *ht, char *arKey, uint nKeyLen, void *pData, uint nDataSize, void **pDest);
int zend_hash_update(HashTable *ht, char *arKey, uint nKeyLen, void *pData, uint nDataSize, void **pDest);
int zend_hash_index_update(HashTable *ht, ulong h, void *pData, uint nDataSize, void **pDest);
int zend_hash_next_index_insert(HashTable *ht, void *pData, uint nDataSize, void **pDest);
The first two functions store data under a string key. For example, $foo['bar'] = 'barvalue' in PHP corresponds to the following in an extension:
zend_hash_add(fooHashTbl, "bar", sizeof("bar"), &barZval, sizeof(zval*), NULL);
This stores the value under the key "bar" in the hash table. Note that the key length is given as sizeof("bar"), which includes the terminating NUL byte.
The only difference between add and update is that if the key already exists, add will fail.
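The add/update distinction can be illustrated with a small stand-alone table. This is a sketch, not the Zend implementation: toy_table, toy_add, toy_update and toy_find are hypothetical names, the lookup is a plain linear scan, and keys are stored by pointer rather than copied.

```c
#include <string.h>

#define TOY_SUCCESS 0
#define TOY_FAILURE (-1)
#define TOY_CAPACITY 16

typedef struct {
    const char *key;  /* NULL marks an unused entry; not copied */
    int value;
} toy_entry;

typedef struct {
    toy_entry entries[TOY_CAPACITY];
} toy_table;

/* Linear scan; returns the matching entry or NULL. */
static toy_entry *toy_find(toy_table *t, const char *key)
{
    for (int i = 0; i < TOY_CAPACITY; i++) {
        if (t->entries[i].key && strcmp(t->entries[i].key, key) == 0) {
            return &t->entries[i];
        }
    }
    return NULL;
}

/* Shared worker: may_overwrite is 1 for update and 0 for add. */
static int toy_store(toy_table *t, const char *key, int value, int may_overwrite)
{
    toy_entry *e = toy_find(t, key);

    if (e) {
        if (!may_overwrite) {
            return TOY_FAILURE;   /* add refuses to replace */
        }
        e->value = value;
        return TOY_SUCCESS;
    }
    for (int i = 0; i < TOY_CAPACITY; i++) {
        if (t->entries[i].key == NULL) {
            t->entries[i].key = key;
            t->entries[i].value = value;
            return TOY_SUCCESS;
        }
    }
    return TOY_FAILURE;           /* table is full */
}

static int toy_add(toy_table *t, const char *key, int value)
{
    return toy_store(t, key, value, 0);
}

static int toy_update(toy_table *t, const char *key, int value)
{
    return toy_store(t, key, value, 1);
}
```

Adding "bar" twice fails the second time and leaves the first value in place, while updating "bar" always succeeds and replaces the value.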
The last two functions store data in ht under a numeric index.
zend_hash_next_index_insert() takes no index parameter; it works out the next free numeric index by itself.
If you need to know that index yourself, you can obtain it with zend_hash_next_free_element():
ulong nextid = zend_hash_next_free_element(ht);
zend_hash_index_update(ht, nextid, &data, sizeof(data), NULL);
The above code is equivalent to:
zend_hash_next_index_insert(ht, &data, sizeof(data), NULL);
The pDest parameter, if it is not NULL, receives the address of the newly stored element.
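How the next free index relates to explicit index updates can be sketched with a toy list. The names (toy_list, toy_index_update, toy_next_index_insert, toy_next_free_element) are hypothetical, and the rule that the next free index becomes one past the highest index written so far is an assumption modelled on how $a[] behaves in PHP.

```c
#define TOY_MAX 32

typedef struct {
    unsigned long next_free;  /* next numeric index to hand out */
    int used[TOY_MAX];
    int values[TOY_MAX];
} toy_list;

static unsigned long toy_next_free_element(const toy_list *l)
{
    return l->next_free;
}

/* Store value at index h; returns 0 on success, -1 if h is out of range. */
static int toy_index_update(toy_list *l, unsigned long h, int value)
{
    if (h >= TOY_MAX) {
        return -1;
    }
    l->values[h] = value;
    l->used[h] = 1;
    if (h >= l->next_free) {
        l->next_free = h + 1;  /* keep next_free past the highest index */
    }
    return 0;
}

/* Equivalent to asking for the next free index and updating it. */
static int toy_next_index_insert(toy_list *l, int value)
{
    return toy_index_update(l, l->next_free, value);
}
```

After an explicit write to index 5, the next automatic insert in this sketch lands at index 6, just as $a[5] = ...; $a[] = ...; does in PHP.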
Recall
There are two functions for retrieving data from a hash table, one for string keys and one for numeric indexes:
int zend_hash_find(HashTable *ht, char *arKey, uint nKeyLength, void **pData);
int zend_hash_index_find(HashTable *ht, ulong h, void **pData);
The following example stores a value at the next free index and then reads it back:
void hash_sample(HashTable *ht, sample_data *data1)
{
    sample_data *data2;
    /* Get the next free numeric index */
    ulong targetID = zend_hash_next_free_element(ht);

    /* Insert data1 into the hash table at that index */
    if (zend_hash_index_update(ht, targetID, data1, sizeof(sample_data), NULL) == FAILURE) {
        /* Should never happen */
        return;
    }
    /* Look the value up by its index; on success data2 points at the stored copy */
    if (zend_hash_index_find(ht, targetID, (void **)&data2) == FAILURE) {
        /* Very unlikely since we just added this element */
        return;
    }
    /* data1 != data2, however *data1 == *data2 */
}
Sometimes you do not need the value itself and only want to know whether an element exists:
int zend_hash_exists(HashTable *ht, char *arKey, uint nKeyLen);
int zend_hash_index_exists(HashTable *ht, ulong h);
These handle string keys and numeric indexes respectively, and return 1 or 0.
/* Check whether the variable foo exists in the active symbol table */
if (zend_hash_exists(EG(active_symbol_table), "foo", sizeof("foo"))) {
    /* $foo is set */
} else {
    /* $foo does not exist */
}
When the same key is used for several operations, its hash value can be computed once:
ulong zend_get_hash_value(char *arKey, uint nKeyLen);
Passing the return value to the following "quick" family of functions avoids recomputing the hash and so speeds things up:
int zend_hash_quick_add(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void *pData, uint nDataSize, void **pDest);
int zend_hash_quick_update(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void *pData, uint nDataSize, void **pDest);
int zend_hash_quick_find(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void **pData);
int zend_hash_quick_exists(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval);
void php_sample_hash_copy(HashTable *hta, HashTable *htb, char *arKey, uint nKeyLen TSRMLS_DC)
{
    /* Compute the hash value once so that both quick calls can reuse it */
    ulong hashval = zend_get_hash_value(arKey, nKeyLen);
    zval **copyval;

    /* Find the element in hta; on success copyval points at the stored zval* */
    if (zend_hash_quick_find(hta, arKey, nKeyLen, hashval, (void**)&copyval) == FAILURE) {
        /* arKey doesn't actually exist */
        return;
    }
    /* The zval* is about to be owned by another hash table,
     * so increment its reference count */
    (*copyval)->refcount__gc++;
    /* Store the zval* taken from hta in htb under the same key */
    zend_hash_quick_update(htb, arKey, nKeyLen, hashval, copyval, sizeof(zval*), NULL);
}
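To show why a precomputed hash value saves work, here is a stand-alone sketch of a "times 33" string hash and of mapping a hash value to a bucket. The function names toy_hash and toy_bucket are hypothetical. Zend's own hash is believed to belong to this family, but the exact constants and the bucket mapping shown here should be treated as assumptions, not as the engine's code.

```c
#include <stddef.h>

/* Assumed DJB-style "times 33" hash: start at 5381, then for each byte
 * compute h = h * 33 + byte. */
static unsigned long toy_hash(const char *key, size_t len)
{
    unsigned long h = 5381;

    for (size_t i = 0; i < len; i++) {
        h = ((h << 5) + h) + (unsigned char)key[i];
    }
    return h;
}

/* Map a hash value to a bucket. table_size must be a power of two,
 * so that the mask keeps only the low bits. */
static unsigned long toy_bucket(unsigned long h, unsigned long table_size)
{
    return h & (table_size - 1);
}
```

The loop in toy_hash touches every byte of the key, so computing it once and reusing the result for several tables, as php_sample_hash_copy() does, avoids repeating that work. The mask in toy_bucket also hints at why table sizes are rounded up to a power of 2.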
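An entire hash table can be copied into another one:
typedef void (*copy_ctor_func_t)(void *pElement);
void zend_hash_copy(HashTable *target, HashTable *source, copy_ctor_func_t pCopyConstructor, void *tmp, uint size);
Every element in source is copied into target. The pCopyConstructor callback is invoked on each copied element, which makes it possible to increment the element's ref_count as it is copied. Elements already in target that share an index with an element in source are replaced; all other elements are kept.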
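void zend_hash_merge(HashTable *target, HashTable *source, copy_ctor_func_t pCopyConstructor, void *tmp, uint size, int overwrite);
The main difference is the additional overwrite parameter. If it is non-zero, the function behaves like zend_hash_copy(); if it is 0, elements that already exist in target are not copied over.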
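The effect of the overwrite flag can be sketched with two small index-based tables. This is an illustration only: toy_table, toy_set, toy_merge and toy_copy are hypothetical names, and the copy-constructor callback is left out to keep the example short.

```c
#define TOY_MAX 8

typedef struct {
    int used[TOY_MAX];
    int values[TOY_MAX];
} toy_table;

static void toy_set(toy_table *t, int idx, int value)
{
    t->used[idx] = 1;
    t->values[idx] = value;
}

/* Copy every element of source into target. An element that already
 * exists in target is replaced only when overwrite is non-zero. */
static void toy_merge(toy_table *target, const toy_table *source, int overwrite)
{
    for (int i = 0; i < TOY_MAX; i++) {
        if (!source->used[i]) {
            continue;            /* nothing to copy at this index */
        }
        if (target->used[i] && !overwrite) {
            continue;            /* keep the existing target element */
        }
        toy_set(target, i, source->values[i]);
    }
}

/* In this sketch, copy is simply merge with overwrite enabled. */
static void toy_copy(toy_table *target, const toy_table *source)
{
    toy_merge(target, source, 1);
}
```

Merging with overwrite set to 0 only fills in the indexes that target lacks, while toy_copy also replaces the ones both tables share. Elements that exist only in target survive in both cases.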
typedef zend_bool (*merge_checker_func_t)(HashTable *target_ht, void *source_data, zend_hash_key *hash_key, void *pParam);
void zend_hash_merge_ex(HashTable *target, HashTable *source, copy_ctor_func_t pCopyConstructor, uint size, merge_checker_func_t pMergeSource, void *pParam);
The pMergeSource callback makes it possible to merge selectively instead of merging everything. It is reminiscent of the comparison function that C's qsort() accepts, which lets the caller decide how elements are ordered.
zend_bool associative_only(HashTable *ht, void *pData, zend_hash_key *hash_key, void *pParam)
{
    /* True if there's a key, false if there's not.
     * A non-zero nKeyLength means the element has a string key. */
    return (hash_key->arKey && hash_key->nKeyLength);
}
void merge_associative(HashTable *target, HashTable *source)
{
    zend_hash_merge_ex(target, source, zval_add_ref, sizeof(zval*), associative_only, NULL);
}
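The selective merge pattern can also be shown without the Zend API. In the sketch below every name (toy_entry, toy_table, toy_checker_t, toy_append, toy_merge_ex, string_keys_only) is hypothetical. To stay short, it appends accepted elements to the target and ignores key collisions, which a real merge would have to handle.

```c
#include <stddef.h>

#define TOY_MAX 8

typedef struct {
    const char *key;      /* NULL means the element has a numeric index */
    unsigned long index;  /* only meaningful when key is NULL */
    int value;
} toy_entry;

typedef struct {
    toy_entry entries[TOY_MAX];
    int count;
} toy_table;

/* Returns non-zero if the source element should be merged. */
typedef int (*toy_checker_t)(const toy_table *target,
                             const toy_entry *source_entry, void *param);

static int toy_append(toy_table *t, toy_entry e)
{
    if (t->count >= TOY_MAX) {
        return -1;
    }
    t->entries[t->count++] = e;
    return 0;
}

/* Merge only the elements that the checker accepts; returns how many
 * elements were actually merged. */
static int toy_merge_ex(toy_table *target, const toy_table *source,
                        toy_checker_t check, void *param)
{
    int merged = 0;

    for (int i = 0; i < source->count; i++) {
        if (check(target, &source->entries[i], param) &&
            toy_append(target, source->entries[i]) == 0) {
            merged++;
        }
    }
    return merged;
}

/* Counterpart of associative_only: accept string-keyed elements only. */
static int string_keys_only(const toy_table *target,
                            const toy_entry *source_entry, void *param)
{
    (void)target;
    (void)param;
    return source_entry->key != NULL;
}
```

Calling toy_merge_ex(&target, &source, string_keys_only, NULL) copies the string-keyed elements and skips the numerically indexed ones, mirroring what merge_associative() does above.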