A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis
This article will introduce you to the dictionary, hash algorithm and ReHash principle in Redis. I hope it will be helpful to you!
Dictionaries in Redis are widely used to implement various functions of Redis, including databases and hash keys.
The underlying implementation of the dictionary is a hash table. Each dictionary has two hash tables, one is used normally and the other is used when rehash is used to expand the space. [Related recommendations: Redis video tutorial]
Dictionary structure definition
typedef struct dict { // 类型特定函数 dictType *type; // 私有数据 void *privdata; // 哈希表,两个元素 dictht ht[2] // rehash时记录的索引下标,当没有rehash时,值为-1 int rehashidx; } dict;
==When rehash is performed, rehashidx migrates each index The entry data will be 1;==
Among them, the structure of the hash table dicttht is defined as:
typedef struct dictht { // 哈希表数组 dictEntry **table; // 哈希表大小 unsigned long size; // 哈希表大小掩码,用于计算索引值 unsigned long sizenask; // 该哈希表已有节点的数量 unsigned long uesd; } dictht;
Among them, table is an array, and each element of the array points to a pointer of type dictEntry. The dictEntry type stores a key-value pair.
It can also be seen here that the nodes of the hash table are connected by a linked list to solve the hash conflict problem, which is the chain address method.
Hash conflict and hash algorithm
In order to achieve fast access from key to value, Redis uses a hash table to save all key-value pairs. Keycorresponds to the Key set by Redis, and valuecorresponds not to the value itself, but to the pointer pointing to the specific value. The biggest advantage of using a hash table is that you can quickly find key-value pairs with O(1) time complexity. But since it is a hash table, there will inevitably be a hash conflict problem.
Hash conflict means that when the hash value of two keys and the hash bucket are calculated, they happen to fall on the same hash bucket.
The way Redis solves hash conflicts is to use chain hashing, that is, zipper method. When multiple elements point to the same hash bucket, a linked list is used to save the corresponding data in the same hash bucket, and they are connected in turn using pointers.
Hash algorithm
When a new key-value pair is added to the dictionary, the program needs to first calculate the hash value and index based on the key-value pair. value, and then place the hash table node containing the new key-value pair at the specified index of the hash table array based on the index value.
reHash process
There is a load factor in the hash table to control the number of key-value pairs saved in the hash table. This requires a rehash operation to complete. Among them, the calculation formula of the load factor is:
// 负载因子 = 哈希表已保存的节点数量 / 哈希表大小 load_factor = ht[0].used / ht[0].size
The conditions for hash table expansion and contraction are as follows:
- The server is not currently executing the BGSAVE command or the BGREWRITEAOF command, and the hash table The load factor is greater than or equal to 1;
- The server is currently executing the BGSAVE command or the BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 5;
One of the above conditions is met, The rehash process will be executed.
If the server is executing BGSAVE or BGREWRITEAOF, Redis will create a child process of the current server process
The rehash process is roughly divided into three steps:
Allocate larger space to hash table 2, for example twice the current size of hash table 1;
Remap the data in hash table 1 and copy it to the hash In Table 2;
Release the space of hash table 1;
The size of the allocated space in the first step is determined by the current rehash Determined by the operation type and the number of key-value pairs in the current hash table.
-
When an expansion operation is performed, the allocated space size is the first 2^n value that is greater than or equal to (the number of key-value pairs in the hash table * 2);
Assume that the current number of key-value pairs is 4, then 4 * 2 = 8, because 8 is exactly equal to 2^3, which is exactly equal to the first value equal to 2^n, so the expansion space is 8;
If a shrink operation is performed, the allocated space size is the first 2^n value that is greater than or equal to (the number of key-value pairs in the hash table);
Progressive reHash
When there are a large number of hash tables, if all the data is copied at once, it is likely to affect the server. Therefore, Redis performs rehash in multiple times, which is progressive rehash.
To put it simply, in the second step, Redis still processes client requests normally. When processing a request, it starts from the first index position in hash table 1 and moves this index along the way. All entries elements at the position are copied to hash table 2; when the next request is made, the entries at the next index position are copied.
This cleverly allocates the cost of a large number of copies at one time to the process of processing multiple requests, avoiding time-consuming operations and ensuring fast access to data.
Hash table operations during rehash
When performing a progressive rehash operation, dictionary deletion, search, update and other operations will be performed in two hash tables. For example, if you want to find a key in the dictionary, you will first query it in the original table. If it cannot find it, you will query it in the new table.
And the dictionary addition operation will only be saved in the new table.
For more programming-related knowledge, please visit: Introduction to Programming! !
The above is the detailed content of A brief discussion on the dictionary, hash algorithm and ReHash principle in Redis. For more information, please follow other related articles on the PHP Chinese website!

Redis's data model and structure include five main types: 1. String: used to store text or binary data, and supports atomic operations. 2. List: Ordered elements collection, suitable for queues and stacks. 3. Set: Unordered unique elements set, supporting set operation. 4. Ordered Set (SortedSet): A unique set of elements with scores, suitable for rankings. 5. Hash table (Hash): a collection of key-value pairs, suitable for storing objects.

Redis's database methods include in-memory databases and key-value storage. 1) Redis stores data in memory, and reads and writes fast. 2) It uses key-value pairs to store data, supports complex data structures such as lists, collections, hash tables and ordered collections, suitable for caches and NoSQL databases.

Redis is a powerful database solution because it provides fast performance, rich data structures, high availability and scalability, persistence capabilities, and a wide range of ecosystem support. 1) Extremely fast performance: Redis's data is stored in memory and has extremely fast read and write speeds, suitable for high concurrency and low latency applications. 2) Rich data structure: supports multiple data types, such as lists, collections, etc., which are suitable for a variety of scenarios. 3) High availability and scalability: supports master-slave replication and cluster mode to achieve high availability and horizontal scalability. 4) Persistence and data security: Data persistence is achieved through RDB and AOF to ensure data integrity and reliability. 5) Wide ecosystem and community support: with a huge ecosystem and active community,

Key features of Redis include speed, flexibility and rich data structure support. 1) Speed: Redis is an in-memory database, and read and write operations are almost instantaneous, suitable for cache and session management. 2) Flexibility: Supports multiple data structures, such as strings, lists, collections, etc., which are suitable for complex data processing. 3) Data structure support: provides strings, lists, collections, hash tables, etc., which are suitable for different business needs.

The core function of Redis is a high-performance in-memory data storage and processing system. 1) High-speed data access: Redis stores data in memory and provides microsecond-level read and write speed. 2) Rich data structure: supports strings, lists, collections, etc., and adapts to a variety of application scenarios. 3) Persistence: Persist data to disk through RDB and AOF. 4) Publish subscription: Can be used in message queues or real-time communication systems.

Redis supports a variety of data structures, including: 1. String, suitable for storing single-value data; 2. List, suitable for queues and stacks; 3. Set, used for storing non-duplicate data; 4. Ordered Set, suitable for ranking lists and priority queues; 5. Hash table, suitable for storing object or structured data.

Redis counter is a mechanism that uses Redis key-value pair storage to implement counting operations, including the following steps: creating counter keys, increasing counts, decreasing counts, resetting counts, and obtaining counts. The advantages of Redis counters include fast speed, high concurrency, durability and simplicity and ease of use. It can be used in scenarios such as user access counting, real-time metric tracking, game scores and rankings, and order processing counting.

Use the Redis command line tool (redis-cli) to manage and operate Redis through the following steps: Connect to the server, specify the address and port. Send commands to the server using the command name and parameters. Use the HELP command to view help information for a specific command. Use the QUIT command to exit the command line tool.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

AI Hentai Generator
Generate AI Hentai for free.

Hot Article

Hot Tools

SAP NetWeaver Server Adapter for Eclipse
Integrate Eclipse with SAP NetWeaver application server.

Dreamweaver CS6
Visual web development tools

Zend Studio 13.0.1
Powerful PHP integrated development environment

EditPlus Chinese cracked version
Small size, syntax highlighting, does not support code prompt function

MinGW - Minimalist GNU for Windows
This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.