The hash table is at the core of PHP. This is no exaggeration.
PHP's arrays, associative arrays, object properties, function tables, symbol tables, etc. all use HashTable as their container.
PHP's HashTable resolves collisions by separate chaining (the "zipper" method). I won't go into that here; today I am mainly looking at PHP's hash algorithm and some of the ideas the algorithm itself reveals.
PHP's hash uses the very common DJBX33A (Daniel J. Bernstein, Times 33 with Addition). This algorithm is widely used in many software projects, such as Apache, Perl and Berkeley DB. For strings, it is regarded as one of the best hash functions known, because it is very fast to compute and distributes keys very well (few collisions, even distribution).
The core idea of the algorithm is:
hash(i) = hash(i - 1) * 33 + str[i]
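As a minimal sketch of the recurrence above, here is a plain, non-unrolled loop. The function name `times33` and the `seed` parameter are hypothetical names chosen for illustration, not PHP's API. The cast to `unsigned char` is also an assumption of this sketch; for ASCII keys it makes no difference.

```c
#include <assert.h>
#include <stddef.h>

/* Plain Times 33: hash(i) = hash(i - 1) * 33 + str[i].
 * `seed` is the starting value (5381 in PHP, 0 in the classic variant).
 * Hypothetical helper for illustration; not part of PHP. */
static unsigned long times33(const char *str, size_t len, unsigned long seed)
{
    assert(str != NULL || len == 0);

    unsigned long hash = seed;
    for (size_t i = 0; i < len; i++) {
        /* Unsigned arithmetic wraps on overflow, which is what a hash wants. */
        hash = hash * 33 + (unsigned char)str[i];
    }
    return hash;
}
```

For example, `times33("a", 1, 5381)` computes 5381 * 33 + 97 = 177670.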
In zend_hash.h, we can find this algorithm in PHP:
static inline ulong zend_inline_hash_func(char *arKey, uint nKeyLength)
{
    register ulong hash = 5381;

    /* variant with the hash unrolled eight times */
    for (; nKeyLength >= 8; nKeyLength -= 8) {
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
        hash = ((hash << 5) + hash) + *arKey++;
    }
    /* hash the remaining 0..7 characters */
    switch (nKeyLength) {
        case 7: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 6: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 5: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 4: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 3: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 2: hash = ((hash << 5) + hash) + *arKey++; /* fallthrough... */
        case 1: hash = ((hash << 5) + hash) + *arKey++; break;
        case 0: break;
        EMPTY_SWITCH_DEFAULT_CASE()
    }
    return hash;
}
Compared to the classic Times 33 algorithm adopted directly in Apache and Perl:
# Hashing function used in Perl 5.005:
# Return the hashed value of a string: $hash = perlhash("key")
# (Defined by the PERL_HASH macro in hv.h)
sub perlhash
{
    $hash = 0;
    foreach (split //, shift) {
        $hash = $hash*33 + ord($_);
    }
    return $hash;
}
In PHP’s hash algorithm, we can see very subtle differences.
First, the most notable difference is that PHP does not multiply by 33 directly, but uses:
(hash << 5) + hash
Shifting left by 5 multiplies by 32, so adding hash once more gives hash * 33. A shift and an add are, of course, cheaper than a multiplication.
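To make the equivalence concrete, the following sketch compares the shift-and-add form with a direct multiplication. The helper names are hypothetical and exist only for this illustration. The parentheses matter: in C, `+` binds tighter than `<<`, so an unparenthesized `hash << 5 + hash` would shift by `5 + hash` instead.

```c
#include <limits.h>

/* Shift-and-add form: (h << 5) is h * 32, plus h gives h * 33. */
static unsigned long mul33_shift(unsigned long h) { return (h << 5) + h; }

/* Direct multiplication, for comparison. */
static unsigned long mul33_plain(unsigned long h) { return h * 33UL; }

/* Returns 1 if both forms agree for every h in [0, limit) and for
 * ULONG_MAX, where both results wrap around identically. */
static int shift_matches_multiply(unsigned long limit)
{
    for (unsigned long h = 0; h < limit; h++) {
        if (mul33_shift(h) != mul33_plain(h)) {
            return 0;
        }
    }
    return mul33_shift(ULONG_MAX) == mul33_plain(ULONG_MAX);
}
```

Whether the shift form is actually faster depends on the compiler and the CPU; an optimizing compiler may well emit the same instructions for both.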
Even more important is the loop unrolling. A few days ago I read an article about Discuz's caching mechanism, which said that Discuz adopts different caching strategies depending on how popular a post is and on user habits, and caches only the first page of a post (because few readers go past it).
In the same spirit, PHP favors string keys shorter than 8 characters. The loop is unrolled in blocks of 8, so a short key skips the loop entirely and is hashed by the switch fall-through alone. This, too, is a very careful and meticulous detail.
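The sketch below illustrates the unrolling idea under a few assumptions: the `STEP` macro and both function names are hypothetical, and bytes are read as `unsigned char`, which may differ from PHP's plain `char` for non-ASCII bytes. It places an unrolled version next to a simple loop so the two can be compared; a key of n characters runs the loop n / 8 times and leaves n % 8 characters to the switch.

```c
#include <stddef.h>

/* One hashing step: hash = hash * 33 + next byte (hypothetical macro). */
#define STEP(h, p) ((h) = (((h) << 5) + (h)) + (unsigned char)*(p)++)

/* Unrolled eight times, in the style of zend_inline_hash_func. */
static unsigned long djbx33a_unrolled(const char *key, size_t len)
{
    unsigned long hash = 5381;

    for (; len >= 8; len -= 8) {
        STEP(hash, key); STEP(hash, key); STEP(hash, key); STEP(hash, key);
        STEP(hash, key); STEP(hash, key); STEP(hash, key); STEP(hash, key);
    }
    /* Keys shorter than 8 characters land here without entering the loop. */
    switch (len) {
    case 7: STEP(hash, key); /* fallthrough */
    case 6: STEP(hash, key); /* fallthrough */
    case 5: STEP(hash, key); /* fallthrough */
    case 4: STEP(hash, key); /* fallthrough */
    case 3: STEP(hash, key); /* fallthrough */
    case 2: STEP(hash, key); /* fallthrough */
    case 1: STEP(hash, key); break;
    default: break;
    }
    return hash;
}

/* Reference version: one step per loop iteration. */
static unsigned long djbx33a_simple(const char *key, size_t len)
{
    unsigned long hash = 5381;

    while (len-- > 0) {
        STEP(hash, key);
    }
    return hash;
}
```

Both functions return the same value for any input; the unrolled one simply performs fewer loop-condition checks and branches.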
In addition, the function is declared inline and uses a register variable. Clearly, the PHP developers took great pains to optimize the hash.
Finally, the initial hash value is set to 5381, whereas the Times 33 algorithm in Apache and the hash algorithm in Perl both start from 0. Why 5381? I don't know the specific reason, but I did notice some properties of 5381:
Magic Constant 5381:
1. odd number
2. prime number
3. deficient number
Given these properties, I have reason to believe that this choice of initial value gives a better distribution.
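The three properties of 5381 can be checked mechanically. The helpers below are hypothetical and use plain trial division, which is more than enough for a number this small. Note that any prime greater than 2 is automatically odd and deficient (its only proper divisor is 1), so properties 1 and 3 follow from property 2.

```c
#include <stdbool.h>

/* True if n is odd. */
static bool is_odd(unsigned long n) { return n % 2 == 1; }

/* True if n is prime, by trial division up to sqrt(n). */
static bool is_prime(unsigned long n)
{
    if (n < 2) {
        return false;
    }
    for (unsigned long d = 2; d * d <= n; d++) {
        if (n % d == 0) {
            return false;
        }
    }
    return true;
}

/* Sum of the proper divisors of n (all divisors except n itself). */
static unsigned long proper_divisor_sum(unsigned long n)
{
    unsigned long sum = 0;
    for (unsigned long d = 1; d <= n / 2; d++) {
        if (n % d == 0) {
            sum += d;
        }
    }
    return sum;
}

/* A number is deficient when its proper divisors sum to less than itself. */
static bool is_deficient(unsigned long n)
{
    return n > 0 && proper_divisor_sum(n) < n;
}
```

These checks confirm the listed properties; they do not by themselves show that 5381 distributes keys better than other seeds.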