Home  >  Article  >  Backend Development  >  Classic conflict handling of hash tables (hash tables) in data structures

Classic conflict handling of hash tables (hash tables) in data structures

little bottle
little bottleOriginal
2019-04-04 14:46:484502browse

Hash is to establish a certain correspondence relationship f between the storage location of the record and its keywords, so that each keyword key corresponds to a storage location f (key), and establishes the mutual relationship between the keyword and the storage location. Correspondence relationship, this relationship f is called hash function (hash function). The editor of this article mainly talks about the conflict handling problem of hash function.


Classic conflict handling of hash tables (hash tables) in data structures

During the search process, the number of key code comparisons depends on the number of conflicts. The fewer conflicts, the higher the search efficiency. , there will be many conflicts, and the search efficiency will be low. Therefore, factors that affect the number of conflicts are factors that affect search efficiency. There are the following three factors that affect the number of conflicts:

1. Whether the hash function is uniform;

2. The method of handling conflicts;

3. The filling factor of the hash table .

The filling factor of the hash table is defined as: α = the number of elements filled in the table / the length of the hash table

α is a sign factor of the degree of filling of the hash table. Since the table length is a fixed value, α is proportional to the "number of elements filled in the table". Therefore, the larger α is, the more elements are filled in the table, and the greater the possibility of conflict; the smaller α, The fewer elements that populate the table, the less likely there will be conflicts.

In fact, the average search length of the hash table is a function of the filling factor α, but different methods of handling conflicts have different functions.

The methods to resolve hash conflicts generally include:

NO.1 Open addressing method

The so-called open addressing method means that once a conflict occurs, look for the next empty address. As long as the hash table is large enough, the empty hash address can always be found and the record will be stored.

Formula: f(key)=(f(key) di)%m(di=1,2,3….m-1)

For example, the keyword set is { 12, 67, 56, 16, 25, 37, 22, 29, 15, 47, 48, 34}, the table length is 12. Hash function f(key) = key mod 12.

When calculating the first five numbers {12, 67, 56, 16, 25}, they are all hash addresses without conflict and are stored directly; when calculating key = 37, it is found that f(37) = 1. At this time, it conflicts with the location of 25. So apply the above formula f(37) = (f(37) 1) mod 12 =2,. So 37 is stored at the location with index 2. The next 22, 29, 15, and 47 have no conflicts and are deposited normally. At 48, we calculate f(48) = 0, which conflicts with the 0 position of 12. It doesn't matter, we f(48) = (f(48) 1) mod 12 = 1, which conflicts with the position of 25. . So f(48) = (f(48) 2) mod 12 = 2, there is still a conflict... There will be no vacancies until f(48) = (f(48) 6) mod 12 = 6. As shown in the table below.

##6756

NO.2 Re-hashing method

For hash tables, multiple hash functions can be prepared in advance.

Formula: fi(key)=RHi(key)(i=1,2,3...,k)

Here RHi is a different hash function, you can divide and leave the remainder, Folding, squaring and centering are all used. Whenever a hash address conflict occurs, a different hash function is used for calculation.

This method can prevent keyword aggregation, but it also increases the calculation time accordingly.

NO.3 Chain address method (zipper method)

Store all records whose keywords are synonyms in a single linked list. This type of table is called a synonym sublist. Only pointers to the front of all synonym sublists are stored in the hash table. For the keyword set {12, 67, 56, 16, 25, 37, 22, 29, 15, 47, 48, 34}, use the same 12 as the remainder as before and perform the division and remainder method to get the structure below.

Classic conflict handling of hash tables (hash tables) in data structures

NO.4 Establish a public overflow area

This method is to find a new address for you when you are working, and create a public overflow area for all conflicting keywords. overflow area for storage.

As far as the previous example is concerned, there are three keywords 37, 48, and 34 that conflict with the previous keyword positions, so store them in the overflow table. As shown below.

Classic conflict handling of hash tables (hash tables) in data structures

When searching, after calculating the hash address of the given value through the hash function, it is first compared with the corresponding position in the basic table. If they are equal, search Success; if not equal, perform a sequential search in the overflow table. If there are very few conflicting data compared to the basic table, the structure of the common overflow area is still very high for search performance.

[Recommended Courses: C Related Courses]

Serial number 0 1 2 3 4 5 6 7 8 9 10 11
Keywords 12 25

16




The above is the detailed content of Classic conflict handling of hash tables (hash tables) in data structures. For more information, please follow other related articles on the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn