
Introduction to the implementation principle of Redis partition



Redis partitioning, simply put, means distributing data across multiple Redis instances, so that each instance stores only a subset of the whole data set.


Why do we need partitioning? What is the motivation for it? Generally speaking, the benefits of Redis partitioning are as follows:

1. Performance improvement. The network I/O capacity and computing resources of a single Redis machine are limited. Distributing requests across multiple machines makes full use of their combined computing power and network bandwidth, which helps improve the overall service capacity of Redis.

2. Horizontal expansion of storage. Even if Redis's service capacity meets the application's needs, the amount of stored data keeps growing, and a single machine is limited by its own storage capacity. Spreading the data across multiple machines allows the Redis service to scale horizontally.

In general, partitioning frees us from the hardware limits of a single machine. Not enough storage? Not enough computing resources? Not enough bandwidth? All of these can be solved by adding more machines.

Redis Partition Basics

There are many concrete partitioning strategies in practice. For example, suppose we already have four Redis instances: R0, R1, R2, and R3. We also have a batch of keys representing users, such as user:1, user:2, and so on, where the number after "user:" is the user's ID. What we need to do is store these keys across the four Redis instances. How? The simplest way is range partitioning, so let's look at that first.

Range partitioning

So-called range partitioning maps all keys within a given range to the same Redis instance. Using the user data set mentioned above, the approach is as follows:

We can map users with IDs from 0 to 10000 to the R0 instance, users with IDs from 10001 to 20000 to the R1 instance, and so on.
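To make this concrete, here is a minimal sketch of such a range lookup in Python; the boundaries and instance names R0-R3 follow the example above and are purely illustrative:

# A minimal sketch of range partitioning for the user keys above.
# The range boundaries and instance names are illustrative assumptions.
RANGE_TABLE = [
    (0, 10000, "R0"),
    (10001, 20000, "R1"),
    (20001, 30000, "R2"),
    (30001, 40000, "R3"),
]

def instance_for_user(user_id: int) -> str:
    for low, high, instance in RANGE_TABLE:
        if low <= user_id <= high:
            return instance
    raise ValueError(f"no instance configured for user id {user_id}")

print(instance_for_user(12345))  # -> "R1"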

Although this method is simple and works well enough in practice, it still has problems:

We need a table that stores the mapping from user ID ranges to Redis instances, for example, user IDs 0-10000 map to the R0 instance, and so on.

We not only have to maintain this table, we also need such a table for every object type. For example, we are currently storing user information; if we also store order information, we have to create another mapping table.

What if the keys we want to store cannot be divided by range? For example, if our keys are a set of UUIDs, range partitioning is hard to apply.

Therefore, in practice, range partitioning is not a good choice. Don't worry, there is a better way: hash partitioning.

Hash partitioning

An obvious advantage of hash partitioning over range partitioning is that it works with keys of any form; it does not require keys of the form object_name:<id>. The partitioning method is also very simple and can be expressed with a formula:

id = hash(key) % N

where id is the number of the Redis instance and N is the number of instances. The first step is to compute a numeric value from the key with a hash function (such as crc32). Following the example above, the first key we want to process is user:1; suppose hash(user:1) yields 93024922.

Then we take the hash result modulo N. The modulo produces a value between 0 and 3, which maps to one of our Redis instances. Since 93024922 % 4 is 2, we know that user:1 will be stored on R2.
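To make the formula concrete, here is a minimal Python sketch, assuming four Redis instances running locally on ports 6379 to 6382, the redis-py client, and crc32 as the hash function; the layout is purely illustrative:

import zlib
import redis  # the redis-py client (pip install redis)

N = 4  # number of Redis instances: R0..R3
# Assumed layout: four instances on localhost, ports 6379-6382.
instances = [redis.Redis(host="localhost", port=6379 + i) for i in range(N)]

def instance_for(key: str) -> redis.Redis:
    # id = hash(key) % N, with crc32 as the hash function
    return instances[zlib.crc32(key.encode("utf-8")) % N]

instance_for("user:1").set("user:1", "some user data")
print(instance_for("user:1").get("user:1"))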

Different partition implementations

Partitioning can be implemented at different layers of the Redis software stack. Let's take a look:

Client implementation

Client implementation means that the client itself decides which Redis instance a given key will be stored on, as shown in the figure below:

[Figure: schematic of client-side Redis partitioning]

The above is a schematic diagram of the client's implementation of Redis partitioning.

Proxy implementation

Proxy implementation means that the client sends its requests to a proxy server that speaks the Redis protocol, so the proxy can mediate the communication between the client and the Redis servers. The proxy forwards each client request to the correct Redis instance according to the configured partitioning scheme and returns the reply to the client. A schematic diagram of proxy-assisted Redis partitioning is shown below:

[Figure: schematic of proxy-assisted Redis partitioning]

Twemproxy, a proxy for both Redis and Memcached, implements proxy-assisted partitioning.
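From the client's point of view, proxy-assisted partitioning looks like talking to a single Redis server. A minimal sketch, assuming a Twemproxy listener at an illustrative host and port:

import redis

# The proxy host and port below are placeholders for a Twemproxy listener.
proxy = redis.Redis(host="twemproxy-host", port=22121)
proxy.set("user:1", "some user data")  # the proxy forwards this to the correct backend
print(proxy.get("user:1"))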

Query routing

Query routing is a Redis partitioning method implemented by Redis Cluster:

[Figure: schematic of query routing in Redis Cluster]

With query routing, the client can send a query to a random Redis instance, and that instance makes sure the request ends up at the correct instance. Redis Cluster implements a hybrid form of query routing in cooperation with the client: the request is not forwarded directly from one Redis instance to another; instead, the client is redirected to the right node.
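As an example, with redis-py's cluster client the application connects to any known node and leaves the routing to the cluster and the client library; the startup node address below is an assumption:

from redis.cluster import RedisCluster  # available in redis-py 4.1+

# "localhost:7000" is an assumed address of any node in the cluster.
rc = RedisCluster(host="localhost", port=7000)
rc.set("user:1", "some user data")  # routed to the node owning the key's hash slot
print(rc.get("user:1"))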

Disadvantages of Redis partitioning

Although Redis partitioning sounds good so far, it has some serious shortcomings that prevent certain Redis features from working well in a partitioned environment. Let's take a look:

Operations involving multiple keys are generally not supported, because the keys we want to operate on in one batch may be mapped to different Redis instances.

Multi-key Redis transactions are not supported.

The minimum granularity of partitioning is the key, so we cannot split a large data set associated with a single key (for example, a very large sorted set) across different instances.

With partitioning, operational data handling becomes more complex. For example, we have to deal with multiple RDB/AOF files, and backing up the data means gathering persistence files spread across different instances.

Adding and removing machines is complex. For example, Redis Cluster supports mostly transparent rebalancing of data at runtime when machines are added or removed, but other approaches, such as client-side or proxy partitioning, do not support this.

Persistent storage or caching

Although partitioning is conceptually the same whether Redis is used as a persistent data store or as a cache, there is an important limitation for persistent storage: each key must always map to the same Redis instance. When Redis is used as a cache, it does not matter much if a key is mapped to a different instance when its original instance becomes unavailable.

Consistent hashing implementations usually make it possible to map a key to another instance when the instance it normally maps to becomes unavailable. Similarly, if a new machine is added, part of the keys will be mapped to it. The two points to understand are as follows (a small consistent-hashing sketch follows the list):

1. If Redis is used as a cache, scaling up or down by adding or removing machines is easy with consistent hashing.

2. If Redis is used as a (persistent) store, a fixed key-to-instance mapping is required, so we can no longer freely add or remove machines. Otherwise we need a system that can rebalance data when machines are added or removed, and currently only Redis Cluster supports this.
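For illustration, here is a minimal consistent-hashing ring in Python. It is a generic sketch, not the mechanism Redis Cluster actually uses (Redis Cluster relies on fixed hash slots); the node names R0-R3 follow the earlier example:

import bisect
import zlib

class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self.ring = {}             # point on the ring -> node name
        self.sorted_points = []
        for node in nodes:
            self.add_node(node)

    def _hash(self, value: str) -> int:
        return zlib.crc32(value.encode("utf-8"))

    def add_node(self, node: str) -> None:
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            self.ring[point] = node
            bisect.insort(self.sorted_points, point)

    def remove_node(self, node: str) -> None:
        for i in range(self.replicas):
            point = self._hash(f"{node}#{i}")
            del self.ring[point]
            self.sorted_points.remove(point)

    def get_node(self, key: str) -> str:
        # Walk clockwise from the key's position to the next node point.
        idx = bisect.bisect(self.sorted_points, self._hash(key)) % len(self.sorted_points)
        return self.ring[self.sorted_points[idx]]

ring = HashRing(["R0", "R1", "R2", "R3"])
print(ring.get_node("user:1"))
ring.remove_node("R2")           # only keys that lived on R2 move elsewhere
print(ring.get_node("user:1"))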

Pre-Sharding

From the introduction above we know that Redis partitioning has a pain point: unless Redis is used purely as a cache, adding or removing machines is very troublesome.

However, in practice our Redis capacity needs change all the time: we may need 10 Redis machines today and 50 tomorrow.

Given that Redis is very lightweight (a spare instance uses only about 1 MB of memory), a simple solution to this problem is:

We can start many Redis instances from the very beginning, even on a single physical machine. For example, we can choose a generous number of instances, say 32 or 64, as our working set. When the first physical machine runs out of capacity, we can move half of the instances to a second physical machine, and so on. The number of Redis instances in the cluster stays the same while the number of machines grows, which achieves the goal of scaling out.

How do we move a Redis instance? When we need to move an instance to a separate machine, we can follow these steps:

1. Start a new Redis instance on the new physical machine.

2. Configure the new instance as a slave (replica) of the instance being moved.

3. Stop your clients.

4. Update the configuration so that the moved instance is addressed by the new machine's IP address.

5. Send the SLAVEOF NO ONE command to the slave on the new machine, promoting it to a master.

6. Restart your clients with the new IP address.

7. Finally, shut down the old instance that is no longer in use.
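A rough sketch of these steps with redis-py (the hostnames are placeholders, and in practice you would wait for replication to catch up before promoting the new instance):

import redis

old_node = redis.Redis(host="old-host", port=6379)   # instance being moved
new_node = redis.Redis(host="new-host", port=6379)   # step 1: freshly started, empty

# Step 2: make the new instance replicate the instance being moved
new_node.slaveof("old-host", 6379)

# Steps 3-4: stop clients and repoint configuration at new-host (not shown)

# Step 5: promote the new instance to master (SLAVEOF NO ONE)
new_node.slaveof()

# Steps 6-7: restart clients against new-host, then retire the old instance
old_node.shutdown()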

Summary

Building on the concept of Redis partitioning, this article introduced several common implementation approaches and their principles, and finally presented the Pre-Sharding technique as a way to mitigate the problems encountered with those implementations.

