Interviewer: How to solve Redis data skew, hot spots and other issues

Java后端技术全栈

Aug 15, 2023 pm 04:43 PM

As a mainstream technology, Redis has many application scenarios, and interviews at companies large and small treat it as a key topic.

A few days ago, a member of my Planet community ran into the following questions while studying and came to consult Brother Tom.

Since these questions come up frequently and are often encountered at work, I am writing an article to explain them systematically.

Problem description:

Question: While reviewing Redis, I ran into a few questions. Could you take a look?

If a Redis cluster has data skew and data is unevenly distributed, how can it be solved?

When handling a hot key by creating multiple copies of it, such as k-1, k-2, ..., how do we make writes to these copies even, and how do we make reads even?

Redis uses hash slots to maintain the cluster. Like consistent hashing, this avoids a full data migration. Why not just use consistent hashing?

Reply:

As a performance accelerator, a distributed cache plays a very important role in system optimization. Compared with a local cache, it adds one network round trip, which typically takes less than 1 millisecond, but it offers centralized management and supports a very large storage capacity.

In the field of distributed caching, Redis is currently widely used. It stores data purely in memory, executes commands on a single thread, offers rich underlying data structures, and supports storing and querying data along multiple dimensions.

Of course, when the amount of data is large, various problems will arise, such as: data skew, data hotspots, etc.

What is data skew?

The hardware configuration of a single machine has an upper limit, so we generally use a distributed architecture and form a cluster from multiple machines. The cluster in the figure below consists of three Redis instances. The client forwards read and write requests to specific instances through a routing strategy.

Because of the characteristics of the business data and the chosen sharding rules, data may be distributed unevenly across instances, with a large share concentrated on one or a few nodes. Those nodes become heavily loaded while the other nodes sit idle, which lowers overall efficiency.

[Figure: a cluster of three Redis instances; the client routes read and write requests to specific instances]

What are the reasons for data skew?

1. A big key exists

For example, one or more String-type big keys are stored, each taking up a lot of memory.

Brother Tom has investigated this problem before. To save effort during development, a colleague merged multiple pieces of business data into one JSON value under a single key, and that key-value pair grew to several hundred MB.

Frequent reads and writes of big keys consume a lot of memory and put heavy pressure on network transmission. Request responses slow down, which triggers an avalanche effect, and in the end the system raises timeout alarms.

Solution:

The fix is simple: break it into parts. Split the big key into multiple small keys that are maintained independently, which reduces the cost a lot. The split should follow some principles: consider both the business scenario and the access pattern, and keep closely related data together.

For example, suppose an RPC interface depends on Redis internally and used to fetch all of its data in a single access. After splitting, we need to control both the size of each value and the number of accesses, because more calls increase the overall response time of the interface.

It is the same principle as the process optimization that government agencies in Zhejiang advocate: a citizen should need to make at most one trip.
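The splitting idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name split_big_key, the key format base:suffix, and the grouping of fields are hypothetical and do not come from the original article. Fields that are read together stay in the same small key, so one interface call still needs only one access.

```python
import json


def split_big_key(base_key, merged, groups):
    """Split one merged JSON document into several small key/value pairs.

    groups maps a sub-key suffix to the fields that are usually read together.
    (Hypothetical helper for illustration; not part of any Redis client.)
    """
    small_keys = {}
    for suffix, fields in groups.items():
        # Keep only the fields that belong to this access scenario.
        part = {field: merged[field] for field in fields if field in merged}
        small_keys[f"{base_key}:{suffix}"] = json.dumps(part, sort_keys=True)
    return small_keys


merged = {
    "profile": {"name": "tom"},
    "address": ["Hangzhou"],
    "orders": [1, 2, 3],
    "coupons": [],
}
groups = {"basic": ["profile", "address"], "trade": ["orders", "coupons"]}

for key, value in split_big_key("user:1001", merged, groups).items():
    print(key, value)
```

Each resulting pair would then be written to Redis as its own key, so a reader that only needs order data fetches user:1001:trade and never touches the rest.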

2. Improper use of HashTag

Redis executes commands on a single thread, which guarantees atomicity. In a cluster deployment, multi-key batch operations such as MSET and Lua scripts require that the keys involved are routed to the same Redis instance, so the HashTag mechanism was introduced.

Usage is simple: wrap part of the key in {} braces, and the hash is computed only over the string inside the braces, so different keys can be placed in the same hash slot.

For example:

192.168.0.1:6380> CLUSTER KEYSLOT testtag
(integer) 764
192.168.0.1:6380> CLUSTER KEYSLOT {testtag}
(integer) 764
192.168.0.1:6380> CLUSTER KEYSLOT mykey1{testtag}
(integer) 764
192.168.0.1:6380> CLUSTER KEYSLOT mykey2{testtag}
(integer) 764
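To make the rule concrete, here is a minimal Python sketch of the slot calculation with hash tag handling. It assumes that binascii.crc_hqx with an initial value of 0 matches the CRC16 variant Redis Cluster uses (the CCITT polynomial 0x1021); the function name key_slot is a hypothetical helper, not a Redis client API.

```python
import binascii

SLOT_COUNT = 16384


def key_slot(key):
    """Return the hash slot for a key, honoring the {hash tag} rule."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end > start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


# Keys sharing the same tag always land in the same slot.
for key in ["testtag", "{testtag}", "mykey1{testtag}", "mykey2{testtag}"]:
    print(key, key_slot(key))
```

Keys without braces, or with an empty {} pair, are hashed in full, which is why a hash tag should only be added where co-location is really required.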

Check the business code to see whether a HashTag has been introduced that routes too many keys to one instance, then decide how to split them based on the specific scenario.

As with RocketMQ, where keeping messages ordered within a partition is often enough to meet business needs, we need to find the right balance in practice rather than solving problems for their own sake.

3. Uneven distribution of slots

With a Redis Cluster deployment, the cluster's keyspace is divided into 16384 slots. Every key belongs to exactly one of these slots, and each node in the cluster can handle anywhere from 0 to all 16384 slots.

You can manually migrate slots that hold a lot of data to less busy machines so that storage and access stay balanced.

What are cache hotspots?

A cache hotspot means that most or even all business requests hit the same cached data. This puts huge pressure on the cache server and can exceed the load limit of a single machine, causing the server to go down.

Solution:

1. Create multiple copies

We can append a sequence number to the key, for example key#01, key#02, ..., key#10, to create multiple copies. These derived keys are located on multiple cache nodes.

On each access, the client only needs to append a random number, bounded by the number of copies, to the original key, so that requests are routed to different instance nodes.

Note: cached data generally has an expiration time. To avoid many entries expiring at the same moment, try not to give them the same expiration time. We can add a random offset to the preset value.

As for the uniformity of data routing, this is guaranteed by the Hash algorithm.
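The copy scheme can be sketched as follows. This is a hedged illustration: the helper names copy_keys, pick_copy and ttl_with_jitter are hypothetical, and the assumption that a write refreshes every copy while a read picks one at random is one reasonable reading of the text, not something the article states explicitly.

```python
import random


def copy_keys(key, copies):
    """All copy names for a hot key: key#01 .. key#NN (written on update)."""
    return [f"{key}#{i:02d}" for i in range(1, copies + 1)]


def pick_copy(key, copies, rng=random):
    """Pick one copy uniformly at random for a read request."""
    return f"{key}#{rng.randint(1, copies):02d}"


def ttl_with_jitter(base_seconds, max_jitter_seconds, rng=random):
    """Preset expiration plus a random offset, so copies do not expire together."""
    return base_seconds + rng.randint(0, max_jitter_seconds)


print(copy_keys("key", 3))
print(pick_copy("key", 10))
print(ttl_with_jitter(300, 60))
```

Because each copy name hashes to its own slot, the copies usually spread over several nodes, although with a small number of copies nothing guarantees that every copy lands on a different node.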

2. Local memory cache

Cache hotspot data in the client's local memory and set an expiration time. For each read request, it will first check whether the data exists in the local cache. If it exists, it will be returned directly. If it does not exist, it will then access the distributed cache server.

For requests that hit it, the local memory cache completely "liberates" the cache server: those requests put no pressure on the cache server at all.

Disadvantages: it is somewhat troublesome to pick up the latest cached data in real time, so data inconsistency may occur. We can set a relatively short expiration time and rely on passive updates. Alternatively, we can use a listening mechanism that updates the local cache promptly when it detects that the data has changed.
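A minimal sketch of the read path described above, assuming a short fixed expiration time and passive updates. The class name LocalCache and the loader callback (which stands in for the call to the distributed cache) are hypothetical; a production system would more likely use an existing in-process cache library.

```python
import time


class LocalCache:
    """In-process cache in front of a slower loader, with a per-entry TTL."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # called on a miss, e.g. a Redis GET
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]        # local hit: no request reaches the server
        value = self._loader(key)  # miss or expired: go to the distributed cache
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop an entry early, e.g. when a change notification arrives."""
        self._entries.pop(key, None)


cache = LocalCache(loader=lambda key: f"value-of-{key}", ttl_seconds=5)
print(cache.get("hot-key"))  # first call loads, later calls within 5s hit locally
```

The invalidate method is where a change listener would hook in; without one, stale data can be served for at most ttl_seconds.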

Why doesn't Redis Cluster use consistent hashing?

A Redis Cluster has 16384 hash slots. Each key is hashed with CRC16, and the result is taken modulo 16384 to decide which slot the key goes into. Each node in the cluster is responsible for a portion of the hash slots. For example, if the cluster has 3 nodes, node-1 holds hash slots 0 to 5460, node-2 holds hash slots 5461 to 10922, and node-3 holds hash slots 10923 to 16383.
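The two-step lookup (key to slot, slot to node) can be sketched like this. The ranges are the three-node example, with node-3 starting at slot 10923; the names slot_of and node_for_slot are hypothetical, hash tags are ignored for brevity, and binascii.crc_hqx with initial value 0 is assumed to match the CRC16 used by Redis Cluster.

```python
import binascii
import bisect

SLOT_COUNT = 16384
RANGE_ENDS = [5460, 10922, 16383]        # last slot served by each node
NODES = ["node-1", "node-2", "node-3"]


def slot_of(key):
    """CRC16(key) mod 16384 (hash tags are not handled in this sketch)."""
    return binascii.crc_hqx(key.encode(), 0) % SLOT_COUNT


def node_for_slot(slot):
    """Find the node whose slot range contains the given slot."""
    if not 0 <= slot < SLOT_COUNT:
        raise ValueError(f"slot out of range: {slot}")
    return NODES[bisect.bisect_left(RANGE_ENDS, slot)]


key = "user:1001"
print(key, slot_of(key), node_for_slot(slot_of(key)))
```

The slot table is plain data, so moving a slot to another node changes only the table and never the hash function.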

The consistent hashing algorithm was proposed by Karger et al. at MIT in 1997 to solve problems in distributed caching.

Consistent hashing is essentially also a modulo algorithm. Instead of taking the hash modulo the number of servers, it takes the hash modulo a fixed value, 2^32.

Formula: hash(key) % 2^32

The result is always an integer in the interval [0, 2^32-1], which can be viewed as a position on a circle. Nodes are mapped onto the same circle, and the first node found clockwise from the key's position is the node that stores the key.

Consistent hashing greatly reduces the cache invalidation caused by scaling out or in: only the small range of keys that the affected node is responsible for is impacted. However, if the cluster has few machines and each machine is usually under high load, the extra pressure caused by one node going down can easily trigger an avalanche effect.

For example:

Suppose a Redis cluster has 4 machines and the data is evenly distributed, so each machine carries a quarter of the traffic. If one machine suddenly goes down, the next machine in the clockwise direction takes on that extra quarter and ends up carrying half of the traffic, which is rather scary.
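The ring lookup and the failure scenario above can be sketched as follows. This is a simplified illustration without virtual nodes; MD5 as the hash function, the node names redis-1 to redis-4, and the class name HashRing are assumptions made for the example and are not taken from the article.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32


def ring_position(name):
    """hash(name) % 2^32, using MD5 as a stand-in hash function."""
    digest = hashlib.md5(name.encode()).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


class HashRing:
    def __init__(self, nodes):
        # Each node sits at one point on the ring, sorted by position.
        self._points = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._points]

    def node_for(self, key):
        """First node clockwise from the key's position, wrapping around."""
        index = bisect.bisect_left(self._positions, ring_position(key))
        if index == len(self._points):
            index = 0
        return self._points[index][1]


nodes = ["redis-1", "redis-2", "redis-3", "redis-4"]
before = HashRing(nodes)
after = HashRing([node for node in nodes if node != "redis-2"])  # redis-2 fails

keys = [f"key-{i}" for i in range(1000)]
moved = [key for key in keys if before.node_for(key) != after.node_for(key)]
print(len(moved), "keys moved, all to", {after.node_for(key) for key in moved})
```

Only the keys that lived on the failed node move, which is the benefit of consistent hashing, but all of them move to the same clockwise neighbor, which is the load concentration described above.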

With CRC16 plus an explicit binding between slots and instances, scaling out or in only requires smoothly migrating the keys of the affected slots, then broadcasting and storing the new slot mapping. No cache invalidation occurs, and the approach is very flexible.

In addition, if server nodes have different hardware configurations, we can customize how many slots each node is assigned to match its load capacity, which is very convenient.
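As a rough illustration of weighting slots by node capacity, the sketch below divides the 16384 slots into contiguous ranges proportional to a per-node weight. The function allocate_slots and the weights are hypothetical; in a real cluster the assignment is carried out with the cluster management commands rather than computed in application code.

```python
SLOT_COUNT = 16384


def allocate_slots(weights, total=SLOT_COUNT):
    """Assign contiguous slot ranges in proportion to positive node weights.

    Returns {node: (first_slot, last_slot)}, both ends inclusive.
    """
    if any(weight <= 0 for weight in weights.values()):
        raise ValueError("weights must be positive")
    total_weight = sum(weights.values())
    ranges = {}
    start = 0
    accumulated = 0
    for node, weight in weights.items():
        accumulated += weight
        # Cumulative rounding guarantees the last range ends at total - 1.
        end = total * accumulated // total_weight
        ranges[node] = (start, end - 1)
        start = end
    return ranges


# node-3 has twice the capacity of the others, so it gets twice the slots.
print(allocate_slots({"node-1": 1, "node-2": 1, "node-3": 2}))
```

With these weights, node-1 and node-2 each serve 4096 slots and node-3 serves 8192.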

Statement

This article is reproduced from Java后端技术全栈. In case of infringement, please contact admin@php.cn to have it removed.