In the previous article we mostly covered basic knowledge of redis, with no real-world problems or hands-on practice, which gets boring. Today let's get practical.
I'm sure many people online have already written about these three problems, but I still want to go over them today, with plenty of diagrams to deepen the impression. These three questions also come up frequently in interviews, and explaining them clearly takes some skill.
Before discussing the three problems, let's first look at the normal request flow. See the diagram:
The diagram above roughly means the following:
In your code, whether it is a tomcat service or an rpc service, first check whether the data you want exists in the cache. If it does, return it directly to the caller. If it does not, query the database, cache the result, and then return it to the caller. The next query for the same data will then hit the cache.
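As a concrete illustration (not code from the original article), here is a minimal cache-aside sketch of that flow, assuming the Jedis client; the TTL and the `queryFromDb` helper are hypothetical placeholders:

```java
import redis.clients.jedis.Jedis;

public class CacheAsideDemo {
    private final Jedis jedis = new Jedis("localhost", 6379);

    // Cache-aside lookup: hit the cache first, fall back to the DB,
    // then populate the cache so the next query hits it.
    public String get(String key) {
        String value = jedis.get(key);
        if (value != null) {
            return value;                      // cache hit: return directly
        }
        value = queryFromDb(key);              // cache miss: query the database
        if (value != null) {
            jedis.setex(key, 600, value);      // cache the result with a TTL (600s here)
        }
        return value;
    }

    // Hypothetical DB lookup; replace with your real data-access code.
    private String queryFromDb(String key) {
        return null;
    }
}
```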
Cache Avalanche

Definition
I remember when I worked on a recommendation system, some data was computed by an offline algorithm: after a user viewed a product, the requirement was to recommend similar products. The results were stored in both hbase and redis. Since they were all generated by a batch algorithm, if every key was written to redis with the same expiration time, a large number of keys would expire at the same moment, and a flood of requests would hit the backing database. Because a database's throughput is limited, this can very likely bring it down. That situation is a cache avalanche. See the diagram:
This illustrates the scenario in which a cache avalanche occurs. Especially when a scheduled task sets the cache in batches, pay close attention to how the expiration time is set.
How to prevent avalanches
It is actually very simple: when setting cache expiration times in batches, add a random offset (for example a random number within 10 minutes, which can be generated with Java's Random). That way a large number of keys won't all fail at the same moment. See the diagram; a sketch follows below.
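A minimal sketch of that jitter trick, again assuming Jedis; the base TTL and the 10-minute bound are illustrative:

```java
import java.util.Random;
import redis.clients.jedis.Jedis;

public class JitteredCache {
    private static final Random RANDOM = new Random();

    // Cache a value with a base TTL plus a random offset of up to 10 minutes,
    // so keys written in the same batch don't all expire together.
    public static void setWithJitter(Jedis jedis, String key, String value, int baseTtlSeconds) {
        int jitterSeconds = RANDOM.nextInt(10 * 60);   // random number within 10 minutes
        jedis.setex(key, baseTtlSeconds + jitterSeconds, value);
    }
}
```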
What if an avalanche really occurs?

If the traffic is not very large and the database can withstand it, ok, congratulations, you escaped.

If the traffic is very large, exceeding the number of requests the database can handle, the database goes down; congratulations, you've collected a P0 incident ticket.

If the traffic is very large but your database has a rate-limiting scheme, requests beyond the configured threshold are rejected, protecting the backing db.

A few words on rate limiting here. You can cap the number of requests per second that reach the db. Note that this per-second request count (or concurrency) is not the overall requests per second against the data; it can be configured per key, i.e. the queries per second allowed for a given key. The purpose is to stop a flood of requests for the same key from reaching the backing database, so most requests are intercepted. See the diagram: this way most requests for the same key are limited, protecting the database. Rate limiting actually comes in two kinds, local and distributed; in later articles I will introduce both local rate limiting and distributed rate limiting implemented with redis. A local sketch follows below.
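As an illustration only (not the article's own code), here is a minimal local per-key limiter built on Guava's `RateLimiter`; the class name and the per-second rate are assumptions:

```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.ConcurrentHashMap;

public class PerKeyRateLimiter {
    private final ConcurrentHashMap<String, RateLimiter> limiters = new ConcurrentHashMap<>();
    private final double permitsPerSecond;

    public PerKeyRateLimiter(double permitsPerSecond) {
        this.permitsPerSecond = permitsPerSecond;
    }

    // Returns true if a request for this key is allowed through to the DB;
    // each key gets its own limiter, created lazily on first access.
    public boolean tryAcquire(String key) {
        RateLimiter limiter = limiters.computeIfAbsent(key, k -> RateLimiter.create(permitsPerSecond));
        return limiter.tryAcquire();
    }
}
```

Usage would be along the lines of `if (!limiter.tryAcquire("huawei pro")) { /* reject or retry redis */ }`. Note this limits traffic per instance only; the distributed variant is sketched later.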
Cache Breakdown

Definition

For example, when a website runs Double Eleven promotions or flash sales, site traffic is generally very high. A certain product becomes a hot item through promotion, and its traffic is enormous. If, for some reason, that product's cache entry fails at this moment, the traffic for that one key instantly flows to the database, the db finally can't hold on and goes down, and the consequences are easy to imagine: other data can't be queried normally either. See the diagram: the huawei pro key in redis suddenly fails; it may have expired, or it may have been evicted due to insufficient memory. A large flow of requests arrives at redis, finds the key missing, and moves on to the DB to query huawei pro. The DB can't take it and goes down.

How to solve
In the final analysis, the goal is simply to keep excess traffic away from the DB, so we just need to limit the traffic that actually reaches the DB.
1. Rate limiting
This is similar to what was described above: limit the traffic for the specific key, so that when the key suffers a breakdown only one request is allowed through to the db; the others are rejected, or wait and retry the query against redis.
For the rate-limiting diagram, refer back to the one in the cache avalanche section.
This again comes in local and distributed forms.
What is local rate limiting? It limits the traffic for this key within the scope of a single local instance; it is only effective for the current instance.
What is distributed rate limiting? In a distributed environment, the traffic for this key is limited cumulatively across multiple instances; once the shared limit is reached, every instance stops traffic from reaching the DB. A redis-based sketch follows below.
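One possible fixed-window sketch of distributed rate limiting on top of redis itself (`INCR` plus `EXPIRE`), again assuming Jedis; note the incr/expire pair is not atomic here, so a production version would wrap it in a Lua script:

```java
import redis.clients.jedis.Jedis;

public class RedisFixedWindowLimiter {
    private final Jedis jedis;
    private final int maxRequestsPerSecond;

    public RedisFixedWindowLimiter(Jedis jedis, int maxRequestsPerSecond) {
        this.jedis = jedis;
        this.maxRequestsPerSecond = maxRequestsPerSecond;
    }

    // Counts requests for this key in the current one-second window across
    // all instances; returns false once the shared limit is exceeded.
    public boolean tryAcquire(String key) {
        String windowKey = "ratelimit:" + key + ":" + (System.currentTimeMillis() / 1000);
        long count = jedis.incr(windowKey);
        if (count == 1) {
            jedis.expire(windowKey, 2);   // let the window key clean itself up
        }
        return count <= maxRequestsPerSecond;
    }
}
```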
2. Using distributed locks
A brief word on what a distributed lock is: in a concurrent scenario, locks are needed to guarantee mutually exclusive access to shared resources, ensuring thread safety; likewise, in a distributed scenario, a mechanism is needed to guarantee mutually exclusive access to resources shared across multiple nodes, and that mechanism is the distributed lock.
The shared resource in our example is huawei pro in the db: when accessing it, we must ensure that only one thread, one request's worth of traffic, gets through. That is the effect the distributed lock achieves.
See the diagram:
Grabbing the lock:
After a large number of requests fail to find the value of the huawei pro key, they prepare to fetch the data from the db. At this point the db-access code is wrapped in a distributed lock, so each request, each thread, tries to acquire the distributed lock for huawei pro (in the diagram the distributed lock is implemented with redis; a separate article will later cover distributed lock implementations, which are not limited to redis).
After acquiring the lock:
Thread A acquires the distributed lock for huawei pro, so thread A goes to the DB to load the data, sets huawei pro back into the cache, and then returns the data.
The other threads did not get the lock. One approach is to return a null value to the client directly; another is to wait 50-100ms, since querying the db and writing to redis is fast, and then query again: the result may be there by then, and if not, return null directly. Of course you can also retry, but in a high-concurrency scenario you still want to return quickly and avoid too many retries. A sketch of this pattern follows.
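A sketch of the lock-then-load pattern, assuming a Jedis 3.x-style `SET NX PX` API; the key names, TTLs, wait time, and `queryFromDb` helper are all illustrative assumptions:

```java
import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class CacheLoader {
    private final Jedis jedis;

    public CacheLoader(Jedis jedis) {
        this.jedis = jedis;
    }

    public String get(String key) {
        String value = jedis.get(key);
        if (value != null) {
            return value;
        }
        String lockKey = "lock:" + key;
        String token = UUID.randomUUID().toString();
        // SET NX PX: only one thread/instance acquires the lock for this key.
        if ("OK".equals(jedis.set(lockKey, token, SetParams.setParams().nx().px(3000)))) {
            try {
                value = queryFromDb(key);               // only the lock holder hits the DB
                if (value != null) {
                    jedis.setex(key, 600, value);       // repopulate the cache
                }
                return value;
            } finally {
                releaseLock(lockKey, token);
            }
        }
        // Didn't get the lock: wait briefly, then re-check the cache once;
        // returns null if the value still isn't there (a retry is also possible).
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return jedis.get(key);
    }

    // Compare-and-delete via Lua so we only release a lock we still own.
    private void releaseLock(String lockKey, String token) {
        String script = "if redis.call('get', KEYS[1]) == ARGV[1] "
                      + "then return redis.call('del', KEYS[1]) else return 0 end";
        jedis.eval(script, Collections.singletonList(lockKey), Collections.singletonList(token));
    }

    private String queryFromDb(String key) {
        return null;   // hypothetical DB lookup
    }
}
```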
3. Scheduled tasks update hotspot keys
This is easy to understand. Bluntly, a scheduled task periodically monitors whether certain hotspot keys are about to expire, and when one is close to expiring, it extends that key's cache time.
Single-threaded polling: check and renew the expiration times. See the diagram:
Multi-threaded: note that there should not be too many hotspot keys, otherwise you will end up opening a lot of threads; if there are many hotspot keys, use a thread pool. See the diagram, and the sketch below:
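A sketch of the single-threaded polling variant, assuming Jedis; the poll interval and TTL thresholds are illustrative. Swapping in `Executors.newScheduledThreadPool(n)` would give the thread-pool variant:

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

public class HotKeyRefresher {
    private final Jedis jedis;
    private final List<String> hotKeys;    // the hotspot keys to watch

    public HotKeyRefresher(Jedis jedis, List<String> hotKeys) {
        this.jedis = jedis;
        this.hotKeys = hotKeys;
    }

    // Poll each hot key's remaining TTL; when a key is about to expire,
    // extend it so it never actually lapses under load.
    public void start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            for (String key : hotKeys) {
                long ttl = jedis.ttl(key);               // seconds left; -2 means key is gone
                if (ttl >= 0 && ttl < 60) {
                    jedis.expire(key, 600);              // push the expiry back out
                }
            }
        }, 0, 10, TimeUnit.SECONDS);
    }
}
```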
Delay Queue Implementation
Bluntly, whether single-threaded or multi-threaded, polling (which burns CPU on every pass) is used to check whether a key is close to expiring. The check interval makes the timing imprecise, which can cause delays or misses: while you are waiting for the next check, the key may already be gone, and by then the breakdown has already happened. The probability of this is low, but it does happen. How can we avoid it? We can use a delay queue (ring queue); I won't go deep into the queue's internals here, you can Baidu or Google them. A delay queue is one where a message you send is consumed only at the time you set: it will not be consumed before the time is up, and it is consumed once the time arrives. Okay, see the diagram:
1. On first startup, the program fetches the expiration times of the keys in the list.
2. Enqueue each key with a delayed consumption time; note the consumption time is earlier than the expiration time.
3. When the delay elapses, the consumer takes the key from the queue.
4. The consumer consumes the message and extends the key's expiration time in the cache.
5. The key's new expiration time is sent to the delay queue again, waiting for the next renewal. A sketch follows below.
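A sketch of steps 1-5 using Java's built-in `java.util.concurrent.DelayQueue` (the article does not prescribe a specific queue; the 30-second early-consumption margin and the TTLs are assumptions):

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

public class KeyRenewalQueue {

    // A queue element that becomes consumable shortly before the key expires.
    static class ExpiringKey implements Delayed {
        final String key;
        final long consumeAtMillis;

        ExpiringKey(String key, long consumeAtMillis) {
            this.key = key;
            this.consumeAtMillis = consumeAtMillis;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(consumeAtMillis - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    private final DelayQueue<ExpiringKey> queue = new DelayQueue<>();
    private final Jedis jedis;

    public KeyRenewalQueue(Jedis jedis) {
        this.jedis = jedis;
    }

    // Step 2: enqueue a key to be consumed before its actual expiration.
    public void schedule(String key, long expireAtMillis) {
        queue.put(new ExpiringKey(key, expireAtMillis - 30_000));   // consume 30s early
    }

    // Steps 3-5: take keys as their delay elapses, extend the cache TTL,
    // and re-enqueue them with the new expiration time.
    public void runConsumer() throws InterruptedException {
        while (true) {
            ExpiringKey item = queue.take();             // blocks until the delay elapses
            jedis.expire(item.key, 600);                 // extend the key's TTL
            schedule(item.key, System.currentTimeMillis() + 600_000);
        }
    }
}
```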
4. Set the key to never expire
Even then, the key may still be evicted when memory runs short. Think about the circumstances under which keys get evicted.
Cache Penetration

Definition
So-called penetration means accessing a key that exists neither in the cache nor in the database, which is equivalent to the traffic reaching the DB directly. Malicious users can exploit this hole to hammer your interface, crushing your DB so that your business can't run normally.
How to solve it?
1. Set null or special value
We can cache a null or special value in redis without an expiration, so the next time the request comes, it fetches the null or special value straight from redis. A sketch follows.
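A sketch of the idea, with `__NULL__` as an assumed sentinel marker; the article stores it without expiration, though a short TTL is a common safety-valve variant:

```java
import redis.clients.jedis.Jedis;

public class NullValueCache {
    private static final String NULL_SENTINEL = "__NULL__";   // assumed marker for "not in DB"
    private final Jedis jedis;

    public NullValueCache(Jedis jedis) {
        this.jedis = jedis;
    }

    public String get(String key) {
        String value = jedis.get(key);
        if (NULL_SENTINEL.equals(value)) {
            return null;                          // known-missing key: skip the DB entirely
        }
        if (value != null) {
            return value;                         // normal cache hit
        }
        value = queryFromDb(key);                 // hypothetical DB lookup
        if (value != null) {
            jedis.setex(key, 600, value);
        } else {
            jedis.set(key, NULL_SENTINEL);        // cached without expiration, per the text
        }
        return value;
    }

    private String queryFromDb(String key) {
        return null;
    }
}
```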
This scheme does not solve the root problem: if the attacker can forge a large number of distinct useless keys, it does not matter how many nulls or special values you cache. So how should we solve it?
2. Bloom filter
Here is just a brief introduction to the Bloom filter; due to space, a separate article will introduce it in detail later.
For example, suppose our database stores tens of millions of sku records, and the requirement is: if the sku exists in the database, query redis; if redis doesn't have it, query the database and then update redis. The first idea that comes to mind is to put the sku data into a hashmap keyed by sku. But with that many skus, the hashmap would occupy a huge amount of memory and might blow it up, so the gain outweighs the loss. How do we save memory, then? We can use a bit array to store each sku's existence status: 0 means absent, 1 means present. We compute a hash of the sku with a hash function, take that hash modulo the bit array's length to find a position, and set that position to 1. When a request arrives, we compute the sku's hash and check whether the corresponding array position is 1: 1 means it exists, 0 means it doesn't. That is a simple Bloom filter. A Bloom filter has a false-positive rate; you can increase the array length and the number of hash functions to improve accuracy. For the details you can Baidu or Google; I won't cover them today. A toy implementation follows.
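A toy Java version of exactly what the paragraph describes, a bit array plus several hash functions; the seed choice and sizes are arbitrary assumptions (a production system would more likely use something like Guava's BloomFilter):

```java
import java.util.BitSet;

// Toy Bloom filter: a 0 bit anywhere means "definitely absent";
// 1 bits in all k hashed slots mean "probably present".
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int[] seeds;   // one seed per hash function

    public SimpleBloomFilter(int size, int hashCount) {
        this.size = size;
        this.bits = new BitSet(size);
        this.seeds = new int[hashCount];
        for (int i = 0; i < hashCount; i++) {
            seeds[i] = 31 + i * 7;          // arbitrary distinct seeds
        }
    }

    private int hash(String value, int seed) {
        int h = 0;
        for (char c : value.toCharArray()) {
            h = seed * h + c;
        }
        return Math.abs(h % size);          // map the hash into the bit array
    }

    public void add(String sku) {
        for (int seed : seeds) {
            bits.set(hash(sku, seed));      // set the bit for each hash function
        }
    }

    public boolean mightContain(String sku) {
        for (int seed : seeds) {
            if (!bits.get(hash(sku, seed))) {
                return false;               // any 0 bit means definitely not present
            }
        }
        return true;                        // all bits set: probably present
    }
}
```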
Now let's look at the flow of using a Bloom filter to prevent cache penetration. See the diagram:
The Bloom filter can be initialized by a scheduled task that reads the db: size the bit array with every position defaulting to 0 (absent), then for each record compute the array position from its hash value and set it in the bit array.
Request flow, see the diagram:
Without the Bloom filter, a key that doesn't exist in the database at all wastes two IOs: one querying redis and one querying the DB. With the Bloom filter, those two useless IOs are saved, reducing the waste of backend redis and DB resources. A sketch of the filtered request path follows.
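A sketch of that request path with the filter in front, reusing the toy `SimpleBloomFilter` above; `queryFromDb` remains a hypothetical helper:

```java
import redis.clients.jedis.Jedis;

public class FilteredSkuLookup {
    private final SimpleBloomFilter filter;   // the toy filter from the previous sketch
    private final Jedis jedis;

    public FilteredSkuLookup(SimpleBloomFilter filter, Jedis jedis) {
        this.filter = filter;
        this.jedis = jedis;
    }

    public String getSku(String sku) {
        if (!filter.mightContain(sku)) {
            return null;                      // definitely not in the DB: both IOs saved
        }
        String value = jedis.get(sku);
        if (value != null) {
            return value;                     // cache hit
        }
        value = queryFromDb(sku);             // a rare false positive can still miss here
        if (value != null) {
            jedis.setex(sku, 600, value);
        }
        return value;
    }

    private String queryFromDb(String sku) {
        return null;                          // hypothetical DB lookup
    }
}
```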
Today we covered the redis caching problems that come up frequently in interviews and in practice, along with their solutions.
Cache Avalanche
Solution:
When setting expiration times, add a random offset; a few minutes is enough.
If an avalanche does occur anyway, rate limiting can be used.
Cache breakdown
Solution:
Rate limiting
Distributed lock
Update hotspot keys on a schedule; here we focused on the delay queue.
Set keys to never expire
Cache penetration
Solution:
Cache a null or special value in redis
Implement with a Bloom filter