What are the technical points of Redis?

PHPz
2023-06-04 08:27:09

1. Why use Redis

When a project adopts Redis, the author believes the main considerations are performance and concurrency. Redis can of course do other things as well, such as distributed locks, but if those features are all you need, other middleware (ZooKeeper, for example) can do the job and Redis is not required.

Therefore, this question is mainly answered from two perspectives: performance and concurrency:

1. Performance

When a SQL statement takes a particularly long time to execute and its result does not change often, that result is a good candidate for caching. Subsequent requests then read from the cache and get a fast response.

Digression: what counts as a fast response? There is no fixed standard; it depends on the kind of interaction. Someone once put it to me this way: "Ideally, page jumps should complete in an instant, and in-page operations in a split second. In addition, operations that take longer than a snap of the fingers should show a progress prompt and allow the user to pause or cancel at any time, so as to give users the best experience."

So how long is an instant, a split second, a snap of the fingers?

According to the "Maha Sangha Vinaya": a split second (kshana) is one thought; twenty thoughts are one instant; twenty instants are one snap of the fingers; twenty snaps of the fingers are one luoyu; twenty luoyu are one xuyu; and one day and one night contain thirty xuyu.

Working down from the 86,400 seconds in a day (divide by 30, then by 20 at each step), one snap of the fingers lasts 7.2 seconds, one instant 0.36 seconds, and one split second 0.018 seconds.

2. Concurrency

When concurrency is very high, having every request hit the database directly will cause database connection errors. In that situation we can put Redis in front as a buffer, so that requests go to Redis first instead of straight to the database.
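
As a rough illustration of letting requests go to the cache first, the sketch below puts a cache in front of a slow query. A plain dict stands in for Redis, and `slow_query`, `get_report`, and the 60-second TTL are hypothetical names and values chosen for the example, not part of the original article.

```python
import time

cache = {}          # stand-in for Redis: key -> (value, expires_at)
db_calls = []       # records every time the "database" is actually hit

def slow_query(report_id):
    """Hypothetical expensive SQL query whose result rarely changes."""
    db_calls.append(report_id)
    return f"report-{report_id}"

def get_report(report_id, ttl=60, now=time.time):
    """Read from the cache first; fall back to the database only on a miss."""
    entry = cache.get(report_id)
    if entry is not None and entry[1] > now():
        return entry[0]                      # cache hit: no database access
    value = slow_query(report_id)            # cache miss: query once...
    cache[report_id] = (value, now() + ttl)  # ...and keep the result for later requests
    return value
```

Calling `get_report(7)` twice touches the stand-in database only once; the second call is served from the cache.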

2. What are the disadvantages of using Redis

Everyone has been using Redis for a long time, so this question has to be understood. You will almost certainly run into some problems when using Redis; the common ones fall into four areas:

1. Cache and database double-write consistency issues

2. Cache avalanche problem

3. Cache breakdown problem

4. Cache concurrency competition problem

In the author's experience, these four problems come up fairly often in projects; specific solutions are given later.

3. Why is single-threaded Redis so fast?

This question probes Redis's internal mechanism. In the author's interviewing experience, many people did not even realize that Redis uses a single-threaded working model, so this question is still worth reviewing. There are three main points:

1. Pure memory operation

2. Single-thread operation, avoiding frequent context switching

3. Adopt a non-blocking I/O multiplexing mechanism

Let us look at the I/O multiplexing mechanism in more detail, because the term sounds simple yet most people cannot say what it means. An analogy: Xiaoqu opened a courier store in City S, offering same-city express delivery. With limited funds, Xiaoqu first hired a batch of couriers and then found that the remaining money was only enough to buy a single car for deliveries.

Business method one:

Every time a customer drops off a package, Xiaoqu assigns one courier to look after it, and that courier then drives out to deliver it. Over time Xiaoqu finds many problems with this approach: the dozens of couriers spend most of their time fighting over the car, most of them sit idle, and only whoever grabs the car can make a delivery.

As the number of packages grew, so did the number of couriers. The store became more and more crowded until there was no room to hire anyone new, coordinating the couriers took a lot of time, and most of that time went to fighting over the car. Learning from these shortcomings, Xiaoqu came up with the following approach.

Business method two:

Xiaoqu hires only one courier. Packages dropped off by customers are labeled by destination and placed in order in one spot. The courier then takes the packages one at a time, drives out to deliver one, and comes back for the next.

Comparing the two approaches, the second is clearly more efficient. In the metaphor above:

1. Each courier → each thread

2. Each package → each Socket (I/O stream)

3. The delivery location of the express delivery → the different states of the Socket

4. The customer’s request to send the express delivery → the request from the client

5. Xiaoqu’s business method → the code running on the server

6. One car → Number of CPU cores

So we have the following conclusions:

1. The first business method is the traditional concurrency model: each I/O stream (package) is managed by its own thread (courier).

2. The second business method is I/O multiplexing: a single thread (one courier) manages multiple I/O streams by tracking the state of each stream (the delivery status of each package).

The real Redis thread model maps onto this analogy as follows:

Put simply, while a Redis client operates, it produces Sockets with different event types. On the server side, an I/O multiplexing program places them in a queue. The file event dispatcher then takes them from the queue one at a time and forwards each to the matching event handler.

Note that for this I/O multiplexing mechanism, Redis wraps several multiplexing libraries, such as select, epoll, evport, and kqueue. You can read up on them yourself.
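
To make the "one courier, many packages" idea concrete, here is a minimal sketch using Python's standard `selectors` module, which wraps the same kind of OS facilities (epoll, kqueue, select). It is not Redis's event loop; the socket pairs, the `client-N` labels, and the upper-casing "handler" are assumptions made purely for illustration.

```python
import selectors
import socket

def run_event_loop(sel, expected):
    """One thread, one loop: ask the OS which sockets are ready, then handle each in turn."""
    handled = []
    for _ in range(10):                           # bounded so the sketch always terminates
        for key, _mask in sel.select(timeout=1):
            conn = key.fileobj
            request = conn.recv(1024)             # does not block: the socket is known to be readable
            conn.sendall(b"+" + request.upper())  # the "event handler" for this sketch
            handled.append(key.data)
        if len(handled) >= expected:
            break
    return handled

def demo():
    sel = selectors.DefaultSelector()             # picks the best mechanism for the platform
    pairs = []
    for i in range(3):
        client, server = socket.socketpair()
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ, data=f"client-{i}")
        pairs.append((client, server))
    pairs[0][0].sendall(b"ping")                  # clients 0 and 2 send requests
    pairs[2][0].sendall(b"get k")                 # client 1 stays idle and needs no thread of its own
    handled = run_event_loop(sel, expected=2)
    replies = [pairs[0][0].recv(1024), pairs[2][0].recv(1024)]
    for client, server in pairs:
        sel.unregister(server)
        client.close()
        server.close()
    sel.close()
    return sorted(handled), replies
```

The single loop serves whichever connections are ready and simply skips the idle one, which is the property the courier analogy describes.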

4. Redis data types and respective usage scenarios

Does this question look very basic? I think so too. Yet in my interviewing experience, at least 80% of people cannot answer it. I suggest reviewing the types after you have used them in a project, which leaves a deeper impression than rote memorization. A competent programmer will generally use all five types:

1. String

There is not much to say here. It covers the most common Set/Get operations, and the Value can be either a String or a number. It is typically used to cache counters for more complex counting features.

2. Hash

Here the Value holds a structured object, and it is convenient to operate on individual fields within it. When implementing single sign-on, the author used this data structure to store user information, with the CookieId as the Key and a 30-minute cache expiration, which simulates a Session-like effect very well.

3. List

With the List data structure you can build a simple message queue. You can also use the LRANGE command to implement paging, which performs very well and gives a good user experience.

4. Set

Because a Set holds a collection of unique values, it can be used for global deduplication.

Why not use the Set that comes with the JVM for deduplication? Because our systems are generally deployed in clusters, it is troublesome to use the Set that comes with the JVM. Is it necessary to create a public service for the purpose of global deduplication? Too much trouble.

In addition, by using operations such as intersection, union, and difference, you can calculate common preferences, all preferences, and your own unique preferences.

5. Sorted Set

By giving each element a Score, a Sorted Set keeps its elements ordered by that Score. You can build leaderboard applications and take the TOP N. Sorted Sets can also be used for delayed tasks. A final use is range lookups.
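
Two of the scenarios above, the Hash-based session and the Sorted Set leaderboard, might look roughly like this. The sketch assumes `r` is a client exposing the redis-py method names `hset`, `expire`, `zadd`, and `zrevrange`; the `session:` key prefix and the function names are hypothetical.

```python
SESSION_TTL_SECONDS = 30 * 60   # the 30-minute expiry mentioned for the Hash example

def save_session(r, cookie_id, user_fields):
    """Store user info as a Hash keyed by the cookie id and let it expire like a session."""
    key = f"session:{cookie_id}"            # hypothetical key naming scheme
    r.hset(key, mapping=user_fields)
    r.expire(key, SESSION_TTL_SECONDS)
    return key

def record_score(r, board, member, score):
    """Add or update a member's score in a Sorted Set."""
    r.zadd(board, {member: score})

def top_n(r, board, n):
    """Return the n highest-scoring members, best first."""
    return r.zrevrange(board, 0, n - 1, withscores=True)
```

With redis-py installed and a server running, `r` would typically be created with `redis.Redis(decode_responses=True)`; that setup is assumed here rather than shown.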

5. Redis expiration strategy and memory elimination mechanism

The importance of this question is self-evident: it shows whether you have really used Redis properly. For example, if your Redis can hold only 5 GB of data and you write 10 GB, 5 GB will be removed. How is it removed? Have you thought about that? Also, your data has expiration times set, yet memory usage stays high after those times have passed. Have you thought about why?

Redis uses periodic deletion combined with lazy deletion.

Why not use timed deletion?

Timed deletion uses a timer to watch each Key and deletes it automatically the moment it expires. Memory is released promptly, but this consumes a lot of CPU. Under heavy concurrent load, the CPU should spend its time handling requests rather than deleting Keys, so this strategy is not adopted.

How do periodic deletion and lazy deletion work?

With periodic deletion, Redis by default checks for expired Keys every 100 ms and deletes any it finds. Note that Redis does not check every Key each 100 ms; it checks a random sample (if it checked every Key every 100 ms, Redis would grind to a halt). So with periodic deletion alone, many Keys would still be sitting in memory after their expiration time.

This is where lazy deletion comes in. When you fetch a Key, Redis checks whether it has an expiration time set and, if so, whether it has expired. If it has expired, it is deleted at that point.

Are there any remaining problems with periodic deletion plus lazy deletion?

Yes. If periodic deletion misses a Key and you never request that Key afterwards, lazy deletion never takes effect either, so Redis's memory usage keeps growing. This is what the memory eviction mechanism is for.
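
The interplay between the two strategies can be sketched with a toy in-memory store. This is only a model of the idea, not Redis's implementation; the class name, the injected `clock`, and the sample size of 20 are assumptions for illustration.

```python
import random

class ExpiringStore:
    """Toy key-value store combining periodic sampling with lazy expiry checks."""

    def __init__(self, clock):
        self.clock = clock          # callable returning the current time in seconds
        self.data = {}              # key -> value
        self.expires = {}           # key -> absolute expiry time

    def set(self, key, value, ttl=None):
        self.data[key] = value
        if ttl is not None:
            self.expires[key] = self.clock() + ttl
        else:
            self.expires.pop(key, None)

    def _expired(self, key):
        return key in self.expires and self.expires[key] <= self.clock()

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key):
        """Lazy deletion: check expiry only when the key is actually requested."""
        if self._expired(key):
            self._delete(key)
            return None
        return self.data.get(key)

    def periodic_sweep(self, sample_size=20):
        """Periodic deletion: test a random sample of keys with a TTL, not all of them."""
        candidates = list(self.expires)
        for key in random.sample(candidates, min(sample_size, len(candidates))):
            if self._expired(key):
                self._delete(key)
```

After one sweep over 100 expired keys, most of them are still in memory; they disappear only when a later sweep samples them or a `get` touches them, which is exactly the gap that memory eviction has to cover.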

There is a line of configuration in redis.conf:

# maxmemory-policy volatile-lru

This setting selects the memory eviction policy:

noeviction: when memory is insufficient for newly written data, new writes return an error. Probably nobody uses this;

allkeys-lru: when memory is insufficient for newly written data, remove the least recently used Key from the whole key space. Recommended; this is what the author's project currently uses;

allkeys-random: when memory is insufficient for newly written data, remove a random Key from the whole key space. Probably nobody uses this either;

volatile-lru: when memory is insufficient for newly written data, remove the least recently used Key among the Keys that have an expiration time set. This is generally used when Redis serves as both a cache and persistent storage. Not recommended;

volatile-random: when memory is insufficient for newly written data, remove a random Key among the Keys that have an expiration time set. Still not recommended;

volatile-ttl: when memory is insufficient for newly written data, among the Keys that have an expiration time set, remove those with the earliest expiration times first. Not recommended.

PS: If no Key has an expiration time set, the precondition is not met, and volatile-lru, volatile-random, and volatile-ttl behave basically like noeviction (nothing is deleted).
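
The difference between allkeys-lru, volatile-lru, and noeviction can be illustrated with a toy cache. Two simplifications are assumed: the limit is a key count rather than a memory size, and the LRU order is exact, whereas Redis uses an approximation. The class and attribute names are hypothetical.

```python
from collections import OrderedDict

class LruCache:
    """Toy cache with an exact LRU order, limited by key count instead of memory."""

    def __init__(self, max_keys, policy="allkeys-lru"):
        self.max_keys = max_keys
        self.policy = policy
        self.data = OrderedDict()   # least recently used key comes first
        self.has_ttl = set()        # keys that were written with an expiration

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)      # mark as most recently used
            return self.data[key]
        return None

    def set(self, key, value, ttl=None):
        if key not in self.data and len(self.data) >= self.max_keys:
            self._evict_one()
        self.data[key] = value
        self.data.move_to_end(key)
        if ttl is not None:
            self.has_ttl.add(key)
        else:
            self.has_ttl.discard(key)

    def _evict_one(self):
        if self.policy == "allkeys-lru":
            candidates = self.data                       # every key is a candidate
        elif self.policy == "volatile-lru":
            candidates = [k for k in self.data if k in self.has_ttl]
        else:                                            # anything else acts as noeviction
            candidates = []
        victim = next(iter(candidates), None)
        if victim is None:
            raise MemoryError("no evictable key: write rejected")
        del self.data[victim]
        self.has_ttl.discard(victim)
```

Note how a volatile-lru cache holding only keys without an expiration rejects the write, which mirrors the PS above.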

6. Redis and database double-write consistency issue

Consistency is a common problem in distributed systems, and it can be further divided into eventual consistency and strong consistency. If the database and cache are both written, inconsistency is unavoidable. To answer this question, first understand one premise: data with strong consistency requirements must not be cached. Everything we do can only guarantee eventual consistency.

The solutions proposed here only reduce the probability of inconsistency; they cannot eliminate it. That is why data with strong consistency requirements should not be cached.

The article "Distributed Database and Cache Double-Write Consistency Solution" gives a detailed analysis. Briefly: first, adopt the correct update strategy, which is to update the database first and then delete the cache; second, because deleting the cache may fail, provide a compensation measure, such as a message queue.
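
A minimal sketch of "update the database first, then delete the cache, and compensate for failed deletes" could look like this. Dicts stand in for the database and Redis, a `deque` stands in for the message queue, and the `fail` flag that simulates a cache outage is an assumption made for the demo.

```python
from collections import deque

db = {}                 # stand-in for the database
cache = {}              # stand-in for Redis
retry_queue = deque()   # stand-in for a message queue of keys whose cache delete failed

def delete_from_cache(key, fail=False):
    """Delete a cache entry; `fail` simulates a network error for the demo."""
    if fail:
        raise ConnectionError("cache unreachable")
    cache.pop(key, None)

def update(key, value, cache_fails=False):
    """Write path: update the database first, then delete (not update) the cache."""
    db[key] = value
    try:
        delete_from_cache(key, fail=cache_fails)
    except ConnectionError:
        retry_queue.append(key)     # compensation: retry the delete later

def drain_retry_queue():
    """Consumer side of the compensation: retry each failed delete."""
    while retry_queue:
        delete_from_cache(retry_queue.popleft())

def read(key):
    """Read path: cache first, then database, repopulating the cache on a miss."""
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value
```

Between the failed delete and the retry, readers still see the stale value, which is why this approach gives eventual rather than strong consistency.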

7. Dealing with cache penetration and cache avalanche issues

Small and medium-sized traditional software companies rarely run into cache penetration and cache avalanche. But if your project handles traffic in the millions, both problems need careful thought:

1. Dealing with cache penetration

Cache penetration means that an attacker deliberately requests data that does not exist in the cache (typically because it does not exist in the database either), so every request falls through to the database and causes database connection errors.

Solution:

1. Use a mutex lock. When the cache misses, first acquire the lock; once you hold it, query the database. If you fail to acquire the lock, sleep for a while and retry;

2. Adopt an asynchronous update strategy: return immediately whether or not the Key has a value. The Value carries its own cache expiration time; if the cache has expired, a thread is started asynchronously to read the database and update the cache. This requires a cache warm-up (loading the cache before the project starts);

3. Provide an interception mechanism that can quickly tell whether a request is valid, for example a Bloom filter that internally maintains the set of legal, valid Keys and quickly checks whether the Key carried by a request is one of them. If not, return immediately.
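
The Bloom filter interception can be sketched as follows. This is a small hand-rolled filter for illustration, not a production library; the bit-array size, the SHA-256-based hashing scheme, and the sample ids are all assumptions.

```python
import hashlib

class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions per key by salting the hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))

valid_ids = BloomFilter()
for user_id in ("1001", "1002", "1003"):    # hypothetical ids loaded from the database
    valid_ids.add(user_id)

def lookup(user_id, load_from_db):
    """Reject keys the filter has never seen before they reach the cache or database."""
    if not valid_ids.might_contain(user_id):
        return None
    return load_from_db(user_id)
```

Because a Bloom filter can return false positives, a small fraction of invalid keys may still get through; it reduces the load from bogus requests rather than blocking every one.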

2. Dealing with cache avalanche

A cache avalanche means that a large portion of the cache expires at the same moment. If a wave of requests arrives just then, they all go to the database and cause database connection errors.

Solution:

1. Add a random value to the cache expiration time to avoid collective failure;

2. Use a mutex lock, although throughput drops significantly with this approach;

3. Double caching. Keep two caches, cache A and cache B. Cache A expires after 20 minutes, and cache B has no expiration time. Handle the cache warm-up yourself.

The third approach breaks down into the following steps:

a. Read from cache A first, and return directly if the data is there;

b. If A has no data, read from B, return directly, and start an update thread asynchronously;

c. The update thread updates cache A and cache B at the same time.
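
The random expiration offset and the double-cache read path might be sketched like this. Dicts stand in for the two caches, a list stands in for starting an asynchronous update thread, and the 300-second spread is a hypothetical value.

```python
import random

BASE_TTL = 20 * 60   # cache A's 20-minute expiry from the text, in seconds

def jittered_ttl(base=BASE_TTL, spread=300):
    """Spread expirations over [base, base + spread] so keys do not all expire together."""
    return base + random.randint(0, spread)

cache_a = {}            # short-lived cache (entries would carry a TTL in Redis)
cache_b = {}            # long-lived backup cache without expiry
refresh_requests = []   # stand-in for "start an update thread asynchronously"

def read(key):
    """Steps a and b: try A, otherwise fall back to B and schedule a refresh."""
    if key in cache_a:
        return cache_a[key]
    refresh_requests.append(key)
    return cache_b.get(key)

def refresh(key, load_from_db):
    """Step c: the update job writes both caches."""
    value = load_from_db(key)
    cache_a[key] = value
    cache_b[key] = value
```

The trade-off is that a reader may briefly get an older value from cache B while the refresh is pending, in exchange for never sending a burst of reads to the database.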

8. How to solve the Redis concurrency competition Key problem

The problem is roughly this: several subsystems Set the same Key at the same time. What should you watch out for? Searching Baidu beforehand, I found that most answers recommend the Redis transaction mechanism. I do not recommend it. Production environments mostly run Redis clusters with data sharding. When a transaction involves several Keys, those Keys are not necessarily stored on the same Redis-Server, so Redis's transaction mechanism is of little use here.

The solution is as follows:

If the order of operations on this Key does not matter

In this case, prepare a distributed lock and have everyone compete for it; whoever acquires the lock performs the Set. This is relatively simple.

If the operations on this Key must happen in order

Suppose there is a Key1. System A needs to set Key1 to ValueA, system B needs to set Key1 to ValueB, and system C needs to set Key1 to ValueC. The value of Key1 is expected to change in the order ValueA→ValueB→ValueC. In this case, we need to save a timestamp when the data is written to the database. Assume the timestamps are as follows:

1. System A: Key1 {ValueA 3:00}

2. System B: Key1 {ValueB 3:05}

3. System C: Key1 {ValueC 3:10}

Now suppose system B acquires the lock first and sets Key1 to {ValueB 3:05}. When system A then acquires the lock, it finds that ValueA's timestamp is earlier than the timestamp in the cache, so it skips the Set. And so on.
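
The timestamp check under a lock can be sketched as below. A `threading.Lock` stands in for the distributed lock and a dict for Redis; timestamps are written as minutes since midnight (3:00 becomes 180), which is a convention chosen only for this example.

```python
import threading

lock = threading.Lock()   # stand-in for a distributed lock shared by all systems
store = {}                # stand-in for Redis: key -> (value, timestamp)

def set_if_newer(key, value, timestamp):
    """Write only if this value's timestamp is later than the one already stored."""
    with lock:
        current = store.get(key)
        if current is not None and current[1] >= timestamp:
            return False              # an equal or newer write already landed: skip
        store[key] = (value, timestamp)
        return True
```

If system B (185) writes first, system A's later attempt with 180 is skipped, while system C's write with 190 still goes through, so the stored value never moves backwards in time.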

Statement:
This article is reproduced at:yisu.com. If there is any infringement, please contact admin@php.cn delete