Home > Article > Web Front-end > what is javascript hash
In JavaScript, hash refers to a hash table, which is a data structure that directly accesses the memory storage location based on keywords; through the hash table, the storage location of the data element and the keyword of the data element are A certain correspondence is established between them, and the function that establishes this correspondence is called a hash function.
The operating environment of this tutorial: windows7 system, javascript version 1.8.5, Dell G3 computer.
The basic concept of javascript hash:
Hash table (hash table) is a kind of data that directly accesses memory storage locations based on keywords Structure, through the hash table, a certain correspondence is established between the storage location of the data element and the key of the data element. The function that establishes this correspondence is called a hash function.
Hash table construction method:
Assume that the number of data elements to be stored is n, set a continuous storage unit with a length of m (m > n), respectively Taking the keyword Ki (0
From a mathematical point of view, the hash function is actually a mapping of keywords to memory units, so we hope to use the hash function to make the Huaxi address calculated by the hash function as uniform as possible through the simplest operation possible. Back mapping to a series of memory units, there are three key points in constructing a hash function: (1) The operation process should be as simple and efficient as possible to improve the insertion and retrieval efficiency of the hash table; (2) The hash function should have good Hash type to reduce the probability of hash collision; third, the hash function should have greater compression to save memory.
Commonly used methods:
Solution to hash conflict:
In the construction When using a hash table, there is a problem: for two different keywords, when calculating the hash address through our hash function, we get the same hash address. We call this phenomenon a hash conflict.
Hash conflict is mainly related to two factors
(1) Filling factor, the filling factor refers to what has been stored in the hash table The ratio of the number of data elements entered to the size of the hash address space, a=n/m; the smaller a, the smaller the possibility of conflict, on the contrary, the possibility of conflict is greater; but the smaller a, the better the space utilization The smaller, the larger a, the higher the space utilization. In order to take into account hash conflict and storage space utilization, a is usually controlled between 0.6-0.9, while HashTable in .net directly defines the maximum value of a. is 0.72 (Although Microsoft's official MSDN states that the default filling factor of HashTable is 1.0, it is actually a multiple of 0.72)
(2) It is related to the hash function used. If the hash function is appropriate, the hash function can be made Hash addresses are distributed as evenly as possible in the hash address space, thereby reducing the occurrence of conflicts, but the acquisition of a good hash function largely depends on a lot of practice.
1) Open addressing Method
Hi=(H(key) di) MOD m i=1,2,…k(k Where H(key) is the hash function; m is the hash Table length; di is an incremental sequence. There are 3 incremental sequences:
①Linear detection and then hashing: di=1,2,3,…,m-1
②Secondary detection and then hashing: di=12,- 12,22,-22,…±k^2(k ③Pseudo-random detection and then hashing: di=pseudo-random number sequence
shortcoming:
We can see a phenomenon: when records are already filled in at positions i, i 1, and i 2 in the table, the next records with hash addresses of i, i 1, i 2, and i 3 will be filled in. The position of i 3. This phenomenon of two records with different first hash addresses competing for the same subsequent hash address during the process of handling conflicts is called "secondary aggregation", that is, in the process of handling conflicts of synonyms Added non-synonymous conflicts. But on the other hand, using linear detection and then hashing to handle conflicts can guarantee that as long as the hash table is not full, an address Hk that does not conflict can always be found. Secondary detection and rehashing is only possible when the hash table length m is a prime number of the form 4j 3 (j is an integer). That is, the open addressing method will cause secondary aggregation, which is detrimental to search.
2) Rehash method
Hi = RHi (key), i=1,2,…k RHi are all different hash functions, that is, When a synonym produces an address conflict, another hash function address is calculated until no conflict occurs. This method is less prone to aggregation, but increases calculation time.
Disadvantages: Increased calculation time.
3) Chain address method (zipper method)
Store all records whose keywords are synonyms in the same linear linked list.
Advantages:
①The zipper method is simple to handle conflicts and there is no accumulation phenomenon, that is, non-synonymous words will never conflict, so the average search length is shorter;
②Because the zipper method The node space on each linked list is dynamically applied, so it is more suitable for situations where the table length cannot be determined before creating the table;
③ In order to reduce conflicts, the open addressing method requires the filling factor α to be small, so when the node size When larger, a lot of space is wasted. In the zipper method, α ≥ 1 is acceptable, and when the node is large, the pointer domain added in the zipper method can be ignored, thus saving space;
④ In the hash table constructed with the zipper method, the operation of deleting nodes Easy to implement. Just simply delete the corresponding node on the linked list. For the hash table constructed by the open address method, deleting a node cannot simply leave the space of the deleted node empty, otherwise the search path of the synonym node that is filled in the hash table after it will be truncated. This is because in various open address methods, empty address units (ie open addresses) are conditions for search failure. Therefore, when performing a deletion operation on a hash table that uses the open address method to handle conflicts, it can only mark the deleted node for deletion, but cannot actually delete the node.
Disadvantages:
The disadvantage of the zipper method is that pointers require additional space. Therefore, when the node size is small, the open addressing method is more space-saving, and if the saved pointer space is used Increasing the size of the hash table can make the fill factor smaller, which in turn reduces conflicts in the open addressing method and thus increases the average search speed.
#4) Establish a public overflow area
Assuming that the value range of the hash function is [0,m-1], then let the vector HashTable[0...m-1] is the basic table, each component stores a record, and the vector OverTable[0...v] is set up as the overflow table. All records whose keywords are synonymous with the keywords in the basic table, regardless of their hash addresses obtained by the hash function, will be filled in the overflow table in the event of a conflict.
A hash table implementation of a simple hash function without conflict handling
class Hash { constructor() { this.table = new Array(1024); } hash(data) { //就将字符串中的每个字符的ASCLL码值相加起来,再对数组的长度取余 var total = 0; for (var i = 0; i < data.length; i++) { total += data.charCodeAt(i); } console.log("Hash Value: " + data + " -> " + total); return total % this.table.length; } insert(key, val) { var pos = this.hash(key); this.table[pos] = val; } get(key) { var pos = this.hash(key); return this.table[pos] } show() { for (var i = 0; i < this.table.length; i++) { if (this.table[i] != undefined) { console.log(i + ":" + this.table[i]); } } } } var someNames = ["David", "Jennifer", "Donnie", "Raymond", "Cynthia", "Mike", "Clayton", "Danny", "Jonathan"]; var hash = new Hash(); for (var i = 0; i < someNames.length; ++i) { hash.insert(someNames[i], someNames[i]); } hash.show();
uses the square middle method to construct the hash function and the open address method for linear detection method to resolve conflicts.
class Hash { constructor() { this.table = new Array(1000); } hash(data) { var total = 0; for (var i = 0; i < data.length; i++) { total += data.charCodeAt(i); } //把字符串转化为字符用来求和之后进行平方运算 var s = total * total + "" //保留中间2位 var index = s.charAt(s.length / 2 - 1) * 10 + s.charAt(s.length / 2) * 1 console.log("Hash Value: " + data + " -> " + index); return index; } solveClash(index, value) { var table = this.table //进行线性开放地址法解决冲突 for (var i = 0; index + i < 1000; i++) { if (table[index + i] == null) { table[index + i] = value; break; } } } insert(key, val) { var index = this.hash(key); //把取中当做哈希表中索引 if (this.table[index] == null) { this.table[index] = val; } else { this.solveClash(index, val); } } get(key) { var pos = this.hash(key); return this.table[pos] } show() { for (var i = 0; i < this.table.length; i++) { if (this.table[i] != undefined) { console.log(i + ":" + this.table[i]); } } } } var someNames = ["David", "Jennifer", "Donnie", "Raymond", "Cynthia", "Mike", "Clayton", "Danny", "Jonathan"]; var hash = new Hash(); for (var i = 0; i < someNames.length; ++i) { hash.insert(someNames[i], someNames[i]); } hash.show();
[Recommended learning: javascript advanced tutorial]
The above is the detailed content of what is javascript hash. For more information, please follow other related articles on the PHP Chinese website!