Redis is a fully open-source, BSD-licensed, high-performance key-value storage system. It supports data persistence, keeping data in memory while also saving it to disk. Beyond simple key-value pairs, it provides data structures such as list, set, zset (sorted set), and hash, making it very powerful. Redis also supports data backup through master-slave replication, which improves availability. Most important of all is its fast read and write speed, which is why it has become the most frequently used caching solution in daily development. In actual use, however, abnormal situations such as cache avalanche, cache breakdown, and cache penetration can arise, and ignoring them may have catastrophic consequences. The following analyzes and summarizes these cache exceptions and the common ways of handling them.
A cache avalanche occurs when, within a period of time, a large number of requests that should have been served by the Redis cache are sent to the database instead, causing the pressure on the database to rise sharply. In severe cases the database crashes and takes the whole system down with it, like an avalanche triggering a chain reaction; hence the name.
There are two common causes of this situation:
A large amount of cached data expires at the same time, so requests that should have been answered from the cache must fetch their data from the database again.
Redis itself fails and cannot handle requests, so the requests naturally fall through to the database.
For the situation where a large amount of cached data expires at the same time:
When setting expiration times, try to avoid having a large number of keys expire at the same moment. Where simultaneous expiration is unavoidable, randomize and fine-tune the expiration times so the keys do not all expire together (a sketch follows this list).
Add a mutex lock so that cache-rebuilding operations are not performed concurrently.
Use a dual-key strategy: the primary key is the original cache and the backup key is a copy. When the primary key expires, the backup key can still be read. Give the primary key a short expiration time and the backup key a long one.
Update the cache from the background, using scheduled tasks or message queues to refresh or remove Redis cache entries.
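As a concrete illustration of the randomized-expiration idea above, here is a minimal sketch using the Python redis-py client. The base TTL of one hour and the 10% jitter range are illustrative assumptions, not values from this article.

```python
import random

import redis  # assumes the redis-py client is installed

r = redis.Redis(host="localhost", port=6379)

BASE_TTL = 3600  # base expiration of one hour (hypothetical value)

def set_with_jitter(key: str, value: str) -> None:
    # Add a random offset of up to 10% of the base TTL so that keys
    # written in the same batch do not all expire at the same moment.
    ttl = BASE_TTL + random.randint(0, BASE_TTL // 10)
    r.set(key, value, ex=ttl)
```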
For the situation where Redis itself fails:
At the prevention level, a high-availability cluster can be built from master-slave nodes, so that when the master Redis instance goes down, a slave can quickly be promoted to master and continue providing service.
Once a failure does occur, a service circuit breaker or a request rate-limiting strategy can be adopted to keep the flood of requests from crashing the database. A circuit breaker is relatively crude: it stops the service entirely until Redis recovers. Rate limiting is gentler, ensuring that at least some requests are still processed. Neither is a one-size-fits-all answer; choose the appropriate approach based on the specific business situation.
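One simple way to express the rate-limiting idea in application code is to cap how many requests may reach the database concurrently while the cache is down. The sketch below is a hypothetical Python illustration; the class name and the limit of 50 are assumptions, and a production system would more likely use a dedicated gateway or middleware limiter.

```python
import threading

class DbGate:
    """Crude request limiter: allow at most max_concurrent database
    queries at once while the cache layer is unavailable."""

    def __init__(self, max_concurrent: int = 50):
        self._sem = threading.BoundedSemaphore(max_concurrent)

    def query(self, run_db_query):
        # Non-blocking acquire: reject excess requests instead of queueing
        # them, so the database never sees more than the cap at once.
        if not self._sem.acquire(blocking=False):
            raise RuntimeError("too busy, please retry later")
        try:
            return run_db_query()
        finally:
            self._sem.release()
```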
Cache breakdown generally occurs in high-concurrency systems: a large number of concurrent users request data that is not in the cache but does exist in the database. They all miss the cache at the same moment and all go to the database for the same data, causing an instant spike in database pressure. Unlike a cache avalanche, cache breakdown means concurrent queries for the same piece of data; an avalanche means many different pieces of data have expired at once, so many lookups miss the cache and fall through to the database.
The common cause is the expiration of a hot piece of cached data. Because the data is hot and concurrency is high, the moment it expires a large number of requests arrive together, and they all hit the database before the cache can be rebuilt.
There are two common solutions for this situation:
The simple, crude option: do not set an expiration time on hotspot data at all, so it never expires and the situation above cannot occur. If the data needs to be cleaned up later, that can be done by a background job.
Add a mutex lock: after the entry expires, only the first request acquires the lock, queries the database, and writes the result back to the cache; the other requests are blocked until the lock is released and the cache has been refreshed, after which they read from the cache. This way cache breakdown does not occur (see the sketch after this list).
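Below is a minimal sketch of the mutex approach using a Redis SET NX lock via Python and redis-py. The lock key format, the TTLs, and the 50 ms retry delay are illustrative assumptions; a real implementation would also want a unique lock token so that only the lock holder can release it.

```python
import time

import redis

r = redis.Redis()

LOCK_TTL = 10  # seconds; the lock expires on its own if the holder dies

def get_with_rebuild(key: str, load_from_db, ttl: int = 300):
    value = r.get(key)
    if value is not None:
        return value
    lock_key = f"lock:{key}"
    # nx=True makes SET succeed only if the lock key does not already
    # exist, so exactly one request wins the lock and rebuilds the cache.
    if r.set(lock_key, "1", nx=True, ex=LOCK_TTL):
        try:
            value = load_from_db()
            r.set(key, value, ex=ttl)
        finally:
            r.delete(lock_key)
        return value
    # The losers wait briefly, then retry the cache rather than the DB.
    time.sleep(0.05)
    return get_with_rebuild(key, load_from_db, ttl)
```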
Cache penetration means the requested data exists neither in Redis nor in the database. Every time such a request arrives, the key misses the cache, the database is queried as well, and nothing is found there either: two useless lookups. Requests like this effectively bypass the cache and go straight to the database. If someone wants to attack the system maliciously, they can deliberately and frequently request null or otherwise non-existent values, putting heavy pressure on the database.
The cause of this phenomenon is easy to understand: if, in the business logic, the user has never performed the operations that would create a given piece of information, there is naturally no corresponding data in either the database or the cache, and the problem above readily occurs.
For cache penetration, there are generally three solutions:
Restrict illegal requests: mainly parameter validation, authentication checks, and the like, so that large numbers of illegal requests are intercepted at the very start. This is a necessary measure in real business development.
Cache empty or default values: if a key misses the cache and the database also turns up nothing, still cache the empty result, with a fairly short expiration time. The stored default means the next lookup for that key is answered from the cache without touching the database again, which stops a flood of malicious requests from hammering the same non-existent key (a sketch follows after this list).
Use a Bloom filter to quickly determine whether the data exists. What is a Bloom filter? Simply put, it uses several independent hash functions to answer membership queries within a fixed space budget and a bounded false-positive rate. Hash collisions are a known problem: with only one hash function the collision probability is clearly higher, so the core idea of a Bloom filter is to use multiple different hash functions to reduce such conflicts. Its advantages are high space efficiency and short query time, far better than alternative structures; its drawback is a certain false-positive rate. A key that passes the Bloom filter check is not guaranteed to exist, since collisions remain theoretically possible however small the probability, but a key that fails the check definitely does not exist. That alone filters out most requests for non-existent keys, which is sufficient in ordinary scenarios (a sketch of the idea also follows below).
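As a sketch of the empty-value caching idea, here is a hypothetical Python example with redis-py. The key pattern, the empty-byte sentinel, and the 60-second TTL are assumptions made for illustration.

```python
import redis

r = redis.Redis()

NULL_SENTINEL = b""  # placeholder stored to mean "known to be absent"
NULL_TTL = 60        # short TTL so the real data can still appear later

def get_user(user_id: int, load_from_db):
    key = f"user:{user_id}"
    value = r.get(key)
    if value is not None:
        # A cached sentinel means the row is known not to exist, so we
        # return None without querying the database again.
        return None if value == NULL_SENTINEL else value
    row = load_from_db(user_id)  # hypothetical loader returning str or None
    if row is None:
        r.set(key, NULL_SENTINEL, ex=NULL_TTL)  # cache the miss briefly
        return None
    r.set(key, row, ex=3600)
    return row
```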
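And here is a minimal, self-contained Bloom filter sketch in pure Python, to make the multiple-hash-function idea concrete. The bit-array size and hash count are arbitrary illustrative choices; in production one would size them from the expected element count and target false-positive rate, or use a ready-made module such as RedisBloom.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter: k independent hashes over an m-bit array.
    It may report false positives but never false negatives."""

    def __init__(self, m: int = 1 << 20, k: int = 5):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, item: str):
        # Derive k "independent" hash positions by salting SHA-256.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, item: str) -> bool:
        # If any of the k bits is unset, the item was definitely never added.
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))
```

A request whose key fails might_contain can then be rejected before either the cache or the database is consulted.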
Besides the three common Redis cache exceptions above, cache preheating and cache downgrade are also worth mentioning; they are not so much exception problems as two optimization techniques.
Cache preheating loads the relevant data into the cache system before (or right as) the system goes live, without relying on user actions to do it. This avoids the pattern of querying the database first and only then filling the cache when a user makes a request: users directly read data that was warmed up in advance. It keeps high-concurrency traffic in the early stage of a launch from landing on the database and putting it under pressure.
Depending on the volume of data, the following methods can be used:
Small data volume: load it automatically at project startup (see the sketch after this list).
Large data volume: refresh the cache periodically in the background.
Extremely large data volume: preload only the hotspot data.
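Here is a minimal sketch of a startup preheat in Python with redis-py. The function load_hot_rows is a hypothetical helper assumed to return (key, value, ttl) tuples for the data expected to be hot right after launch.

```python
import redis

r = redis.Redis()

def preheat(load_hot_rows) -> None:
    # A pipeline batches all the SET commands into one round trip,
    # so warming the cache does not flood Redis with single requests.
    pipe = r.pipeline()
    for key, value, ttl in load_hot_rows():
        pipe.set(key, value, ex=ttl)
    pipe.execute()
```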
Cache downgrade means that when the cache becomes invalid or the cache service has problems, we deliberately avoid sending the traffic to the database, so the cache failure cannot cascade into a database avalanche, while still keeping the service basically available, even though the degraded service is necessarily lossy. For relatively unimportant cached data, a service-degradation strategy can therefore be adopted.
There are two general approaches:
Read the data directly from a local in-memory cache.
Return the default value configured by the system directly (see the sketch after this list).
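The sketch below illustrates the default-value form of degradation in Python. The product-price scenario, the function names, and the default of 0 are all hypothetical; the point is only that the cache error path returns a safe default rather than falling through to the database.

```python
def get_product_price(product_id: int, cache_get, default_price: int = 0):
    # Degraded read: if the cache layer errors out, return a safe
    # default instead of querying the database.
    try:
        value = cache_get(product_id)
        if value is not None:
            return value
    except Exception:
        pass  # cache service is down or erroring; fall through to default
    return default_price
```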