Let's analyze Redis cache consistency, cache penetration, cache breakdown and cache avalanche issues together


WBOY

May 19, 2022 am 10:12 AM


This article covers common Redis caching problems: cache consistency, cache penetration, cache breakdown, cache avalanche, and keeping cached data in sync with the DB when it is written. Let's start with consistency between the cache and the DB. I hope it will be helpful to everyone.


(1) Cache invalidation consistency problem

The usual way to use a cache is to read the cache first; if the data is not there, read it from the DB and then write the result to the cache. The next time the data is read, it can be obtained directly from the cache.
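
The read path above can be sketched as follows. This is a minimal illustration, not the article's own code: the `cache` and `db` dictionaries and the `db_reads` counter are hypothetical in-memory stand-ins for a real cache client and a database.

```python
# Hypothetical in-memory stand-ins; a real system would use a cache client
# (Redis, Memcached) and a database driver here.
cache = {}
db = {"Hello": "World"}
db_reads = 0  # counts how often a read falls through to the DB


def read(key):
    """Read the cache first; on a miss, read the DB and fill the cache."""
    global db_reads
    if key in cache:
        return cache[key]
    db_reads += 1
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value


first = read("Hello")   # miss: goes to the DB and fills the cache
second = read("Hello")  # hit: served from the cache
```

After the two calls, `db_reads` is 1, because only the first read reached the DB.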

To modify data, first invalidate the cached entry and then modify the DB content. This order avoids the case where the DB modification succeeds but the cached entry is not cleared because of a network or other problem, which would leave dirty data in the cache. Even so, dirty data can still be generated in a concurrent scenario.

Assume that the business has a large number of read and modify requests for the data Key:Hello Value:World. Thread A reads Key:Hello from OCS, gets the Not Found result, requests the data from the DB, and gets Key:Hello Value:World. It then prepares to write this data to OCS, but before the write happens (network delays or waiting for the CPU can slow thread A down), another thread B requests to change the data to Key:Hello Value:OCS. Thread B first invalidates the cache (it does not know whether the entry exists, so it simply performs the invalidation), and OCS processes the invalidation request successfully. Thread A then resumes and writes Key:Hello Value:World into the cache, and its task ends. Thread B also successfully modifies the DB content to Key:Hello Value:OCS. The cache now holds the old value World while the DB holds OCS.

To solve this problem, OCS has extended the Memcached protocol (public cloud will soon support it) with a deleteAndIncVersion interface. This interface does not actually delete the data. Instead it marks the data as expired and increases the data version number; if the data does not exist, it writes NULL and generates a random data version number. OCS writes support atomic comparison of version numbers: if the incoming version number matches the version number stored in OCS, or the original data does not exist, the write is allowed; otherwise the modification is refused.

Back to the scenario just described: thread A reads Key:Hello from OCS, gets the Not Found result, requests the data from the DB, and gets Key:Hello Value:World. It then prepares to write this data to OCS, with the version number defaulting to 1. Before A writes to OCS, another thread B starts to change the data to Key:Hello Value:OCS. It first deletes the cache entry: OCS handles the deleteAndIncVersion request successfully and generates a random version number, 12345 (by convention, greater than 1000). Thread A then resumes and asks OCS to write Key:Hello Value:World. The cache system finds that the incoming version number does not match (1 != 12345), the write fails, and thread A's task ends. Thread B also successfully modifies the DB content to Key:Hello Value:OCS.

At this point, the data in OCS is Key:Hello Value:NULL Version:12345, and the data in the DB is Key:Hello Value:OCS. A subsequent read will try again to write the data from the DB into OCS.
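
The version check can be modelled with a small in-memory class. This is only a sketch of the behaviour described above, not the real OCS API: the class `VersionedCache` and its method names are hypothetical, and it assumes a later reader learns the current version number from its cache miss, which the article does not spell out.

```python
import random


class VersionedCache:
    """Toy model of a version-checked cache (hypothetical, not the OCS client)."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def get(self, key):
        """Return the value, or None if the key is missing or marked expired."""
        entry = self._data.get(key)
        return entry[0] if entry else None

    def version_of(self, key):
        entry = self._data.get(key)
        return entry[1] if entry else None

    def delete_and_inc_version(self, key):
        """Mark the key expired (value None) and move it to a new version."""
        entry = self._data.get(key)
        if entry is None:
            version = random.randint(1001, 1_000_000)  # random, greater than 1000
        else:
            version = entry[1] + 1
        self._data[key] = (None, version)
        return version

    def set_if_version(self, key, value, version=1):
        """Write only if the key is absent or the supplied version matches."""
        entry = self._data.get(key)
        if entry is not None and entry[1] != version:
            return False
        self._data[key] = (value, version)
        return True


cache = VersionedCache()
db = {"Hello": "World"}

# Thread A: cache miss, reads the DB, but has not written to the cache yet.
stale_value = db["Hello"]

# Thread B: invalidates the cache, then modifies the DB.
cache.delete_and_inc_version("Hello")
db["Hello"] = "OCS"

# Thread A resumes with the default version 1; the write is refused.
a_written = cache.set_if_version("Hello", stale_value, version=1)

# A later reader misses, then reloads from the DB using the current version.
reloaded = cache.set_if_version("Hello", db["Hello"], cache.version_of("Hello"))
```

Thread A's stale write is rejected, and the later reload succeeds, so the cache ends up holding the same value as the DB.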

(2) The write synchronization of cached data and the consistency problem with DB

As a website grows and its reliability requirements rise, it will eventually be deployed across multiple IDCs. Each IDC has an independent DB and cache system, and cache consistency between them becomes a prominent issue.

First, to stay efficient, the cache system avoids disk IO, even for writing a BINLOG. Also for performance reasons, the cache system synchronizes only deletions, not writes. Because the cache system is much faster, the cache synchronization generally arrives before the DB synchronization, which creates a window in which there is no data in the cache and old data in the DB. If a business request reads the data in that window, the cache returns Not Found, and the old data is read from the DB and loaded into the cache. When the DB synchronization arrives, only the DB is updated, and the dirty data in the cache cannot be cleared.


As the situation above shows, the root cause of the inconsistency is that heterogeneous systems cannot coordinate their synchronization: nothing guarantees that the DB data is synchronized first and the cached data afterwards. So we need to consider how the cache system can wait for DB synchronization, or whether the two can share one synchronization mechanism. Having cache synchronization also rely on the DB BINLOG is a feasible solution.

The DB in IDC1 is synchronized to the DB in IDC2 through BINLOG. The resulting data modification in IDC2-DB also generates IDC2-DB's own BINLOG, and cache synchronization can be driven by that IDC2-DB BINLOG. After the cache synchronization module parses the BINLOG, it invalidates the corresponding cache key. This changes the synchronization from parallel to serial and guarantees the order.
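
The serial ordering can be illustrated with a toy model. Everything here is a hypothetical in-memory stand-in (a dict for IDC2-DB, a dict for the IDC2 cache, a deque for the BINLOG), and real BINLOG parsing is far more involved; the sketch only shows that invalidation happens after the DB change it refers to.

```python
from collections import deque

# Hypothetical stand-ins for IDC2's DB, its cache, and its BINLOG.
idc2_db = {"Hello": "World"}
idc2_cache = {"Hello": "World"}
idc2_binlog = deque()


def apply_replicated_change(key, value):
    """Apply a change replicated from IDC1; IDC2-DB logs it in its own BINLOG."""
    idc2_db[key] = value
    idc2_binlog.append(key)


def sync_cache_from_binlog():
    """Cache sync module: consume BINLOG entries in order and invalidate each key."""
    while idc2_binlog:
        key = idc2_binlog.popleft()
        idc2_cache.pop(key, None)


def read(key):
    """Cache-first read in IDC2; a miss reloads from IDC2-DB."""
    if key not in idc2_cache:
        idc2_cache[key] = idc2_db[key]
    return idc2_cache[key]


apply_replicated_change("Hello", "OCS")  # the DB is updated first
sync_cache_from_binlog()                 # the cache is invalidated afterwards
value = read("Hello")
```

Because the invalidation is derived from the BINLOG entry, it cannot run before the DB row has changed, so the reload picks up the new value.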

(3) Cache penetration (DB suffered unnecessary query traffic)

Method 1: Use a Bloom filter. Cache penetration means requests for data that exists in neither the cache nor the DB, so every such request falls through to the DB; checking keys against a filter first lets those requests be rejected early. A Bloom filter is an extremely space-efficient probabilistic data structure used to determine whether an element is in a set (similar to a HashSet). Its core is a long binary vector and a series of hash functions. It can be implemented with Google's Guava.

Drawbacks:
1) There is a false positive rate, and it increases as the number of stored elements increases.
2) Under normal circumstances, elements cannot be deleted from a Bloom filter.
3) Determining the array length and the number of hash functions is complex.

What are the usage scenarios of a Bloom filter?
1) Spam address filtering (the number of addresses is huge)
2) Crawler URL deduplication
3) Solving the cache penetration problem
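
The article points to Guava for a Java implementation; the sketch below shows the same idea in Python using only the standard library. The class `BloomFilter`, its default sizes, and the `lookup` helper are assumptions made for illustration, not tuned values or an existing library API.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: a bit array plus several derived hash positions."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


known_ids = BloomFilter()
for user_id in ("user:1", "user:2", "user:3"):
    known_ids.add(user_id)


def lookup(user_id, db):
    """Reject keys the filter has never seen before touching the cache or DB."""
    if not known_ids.might_contain(user_id):
        return None
    return db.get(user_id)
```

A key that was added is always reported as possibly present; a key that was never added is usually, but not always, rejected, which is the false positive rate mentioned above.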

Method 2: Cache empty results as well, and set an expiration time on them.

(4) Cache avalanche (the cache is set to the same expiration time, causing a DB flood)

Method 1: Most system designers use locks or queues to ensure that only a single thread (process) writes the cache, so that a large number of concurrent requests do not fall on the underlying storage system when cache entries expire.

Method 2: Add a random value to the expiration time, so that entries do not all expire at the same moment.
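
Randomizing the expiration time takes only a few lines. The helper name `ttl_with_jitter` and the base and jitter values are assumptions chosen for illustration.

```python
import random


def ttl_with_jitter(base_ttl, max_jitter, rng=random):
    """Return base_ttl plus a random offset in [0, max_jitter] seconds."""
    return base_ttl + rng.randint(0, max_jitter)


# 1000 keys written at the same moment now expire spread over five minutes.
ttls = [ttl_with_jitter(3600, 300) for _ in range(1000)]
```

With the redis-py client, the result could be passed as the `ex` argument of `set`, for example `client.set(key, value, ex=ttl_with_jitter(3600, 300))`, where `client` is an already connected client.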

(5) Cache breakdown (a hotspot key: a small avalanche caused by a large number of concurrent read requests)

When a cached key expires at a certain point in time and there happens to be a large number of concurrent requests for that key at the same moment, each request finds the cache expired and will usually load the data from the back-end DB and write it back to the cache. These concurrent requests can instantly overwhelm the back-end DB.

Method 1: Use a mutex key, which distributed caches support. On a cache miss, first try to set a mutex key. Only when that operation succeeds does the thread load from the DB and write the result back to the cache. That is, the DB load is performed by only one thread.
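
The mutex key approach can be sketched with threads and an in-memory dict. The helper `set_if_absent` stands in for an atomic "set if not exists" operation such as Redis `SET ... NX`; all names are hypothetical. A real mutex key would also need its own expiration so that a crashed holder cannot block everyone else, which this sketch leaves out.

```python
import threading
import time

cache = {}
cache_lock = threading.Lock()  # only makes the in-memory dict operations atomic
db_loads = 0


def set_if_absent(key, value):
    """Stand-in for an atomic 'set if not exists' (for example Redis SET ... NX)."""
    with cache_lock:
        if key in cache:
            return False
        cache[key] = value
        return True


def load_from_db(key):
    global db_loads
    db_loads += 1       # only ever runs while the mutex key is held
    time.sleep(0.05)    # simulate a slow query
    return f"value-of-{key}"


def get(key):
    mutex_key = f"mutex:{key}"
    while True:
        with cache_lock:
            value = cache.get(key)
        if value is not None:
            return value
        if not set_if_absent(mutex_key, "1"):
            time.sleep(0.01)  # another thread is loading; retry shortly
            continue
        try:
            with cache_lock:
                value = cache.get(key)  # re-check: it may have just been filled
            if value is None:
                value = load_from_db(key)
                with cache_lock:
                    cache[key] = value
            return value
        finally:
            with cache_lock:
                cache.pop(mutex_key, None)  # release the mutex key


results = []
threads = [threading.Thread(target=lambda: results.append(get("hot"))) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All twenty readers get the value, but `load_from_db` runs once. The re-check after acquiring the mutex key matters: without it, a reader that missed just before the cache was filled would load from the DB a second time.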

Method 2: Use the mutex key "in advance". Store a timeout value (timeout1) inside the value, where timeout1 is smaller than the actual memcache timeout (timeout2). When a reader finds that timeout1 has expired, it immediately extends timeout1 and writes it back to the cache, then loads the data from the database and sets it in the cache. This makes the business code more intrusive and the coding more complex.
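
A single-threaded sketch of the timeout1 idea follows. The dict `cache`, the parameter names `ttl` and `extend_by`, and the `now` parameter are hypothetical; timeout2 (the cache's own expiration) is not modelled, and in a real cache the "extend timeout1" step would have to be an atomic update.

```python
import time

cache = {}  # key -> {"value": ..., "timeout1": logical deadline}
db_loads = 0


def load_from_db(key):
    global db_loads
    db_loads += 1
    return f"fresh-{key}"


def get(key, ttl=60.0, extend_by=5.0, now=None):
    """Reload before the physical expiry, based on timeout1 stored in the value."""
    now = time.monotonic() if now is None else now
    entry = cache.get(key)
    if entry is None:
        value = load_from_db(key)
        cache[key] = {"value": value, "timeout1": now + ttl}
        return value
    if entry["timeout1"] <= now:
        # Extend timeout1 first, so other readers keep using the old value
        # instead of also going to the DB while this reader reloads.
        entry["timeout1"] = now + extend_by
        value = load_from_db(key)
        cache[key] = {"value": value, "timeout1": now + ttl}
        return value
    return entry["value"]


get("hot", now=0.0)   # first load
get("hot", now=30.0)  # timeout1 not reached: served from the cache
get("hot", now=61.0)  # timeout1 passed: this reader reloads
```

Only the first and third calls reach the DB.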

Method 3: "Never expire". From Redis's point of view, no expiration time is set at all, which ensures that the hotspot key expiration problem cannot occur; this is "physical" non-expiration. From a functional point of view, however, a value that never expires would simply become static. So we store the expiration time in the value corresponding to the key, and when the value is found to be about to expire, a background asynchronous thread rebuilds the cache. This is "logical" expiration.
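
Logical expiration with a background rebuild might look like the sketch below. The names `refresh_margin`, `refreshing`, and `_rebuild` are hypothetical, hot keys are assumed to be preloaded, and `get` returns the worker thread only so that the example can wait for it.

```python
import threading
import time

cache = {}          # key -> {"value": ..., "expire_at": ...}; no physical TTL
refreshing = set()  # keys with a rebuild already in progress
lock = threading.Lock()


def load_from_db(key):
    return f"fresh-{key}"


def _rebuild(key, ttl):
    """Background job: reload the value and push the logical expiry forward."""
    try:
        value = load_from_db(key)
        with lock:
            cache[key] = {"value": value, "expire_at": time.monotonic() + ttl}
    finally:
        with lock:
            refreshing.discard(key)


def get(key, ttl=60.0, refresh_margin=5.0):
    """Always answer from the cache; rebuild asynchronously near logical expiry."""
    with lock:
        entry = cache[key]  # assumption: hot keys are preloaded
        due = entry["expire_at"] - time.monotonic() <= refresh_margin
        start_refresh = due and key not in refreshing
        if start_refresh:
            refreshing.add(key)
    worker = None
    if start_refresh:
        worker = threading.Thread(target=_rebuild, args=(key, ttl), daemon=True)
        worker.start()
    return entry["value"], worker


cache["hot"] = {"value": "stale-hot", "expire_at": time.monotonic() - 1}
value, worker = get("hot")   # returns the old value at once, starts a rebuild
worker.join()
value_after, _ = get("hot")  # now sees the rebuilt value
```

Readers never wait for the DB: the request that notices the logical expiry still gets the old value, and later requests see the rebuilt one.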

(6) Common cache full and data loss problems in cache systems

These need to be analyzed against the specific business. Usually we use an LRU policy to handle a full cache, and Redis's RDB and AOF persistence to protect data in certain situations.
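
The LRU policy itself can be shown with a small exact implementation. The class `LRUCache` is hypothetical and only illustrates the policy; Redis approximates LRU by sampling keys rather than tracking an exact order.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU: when full, evict the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


lru = LRUCache(2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")     # "a" becomes the most recently used key
lru.put("c", 3)  # the cache is full, so "b" is evicted
```

After these calls the cache holds "a" and "c"; "b" was the least recently used key when the cache overflowed.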


Statement

This article is reproduced from 简书 (Jianshu). If there is any infringement, please contact admin@php.cn to have it removed.
