Redis distributed lock｜Five evolution plans from bronze to diamond

Java后端技术全栈

Aug 23, 2023 pm 02:54 PM

redis

The main contents of this article are as follows:

1. The problem of local lock

First of all, let’s review the problem of local locks:

Suppose the question microservice is deployed as four instances. Front-end requests are distributed across them, so if the front end sends 100,000 requests, each instance receives about 25,000. When the cache entry expires, each instance takes a lock (synchronized or Lock) before querying the database, so that only one of its own threads rebuilds the cache. This prevents cache breakdown within that instance.

This is local locking, and it causes data inconsistency in a distributed deployment. For example, service A reads the data and sets the cache key to 100, while service B, which is not bound by A's lock, concurrently sets the same key to 99. The final value may be 99 or 100; the outcome is indeterminate and may not match the expected result. The original article illustrates this with a flow chart.
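As a rough illustration of why a JVM-local lock does not protect across instances, the sketch below models two service instances inside one process. This is an assumption for demonstration only: real instances run in separate JVMs, and the names LocalLockSketch and ServiceInstance are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// In-process stand-in for two deployed instances (assumption); not real microservices.
class LocalLockSketch {

    /** One "service instance" with its own JVM-local lock (hypothetical type). */
    static class ServiceInstance {
        private final ReentrantLock localLock = new ReentrantLock();

        boolean tryEnterCriticalSection() {
            return localLock.tryLock();
        }

        void leaveCriticalSection() {
            localLock.unlock();
        }
    }

    /** Returns true when instance B enters its critical section while A is still inside its own. */
    static boolean bothInstancesEnter() {
        ServiceInstance a = new ServiceInstance();
        ServiceInstance b = new ServiceInstance();

        boolean aEntered = a.tryEnterCriticalSection();

        // B runs on another thread and only ever sees its own lock object.
        boolean[] bEntered = {false};
        Thread other = new Thread(() -> {
            bEntered[0] = b.tryEnterCriticalSection();
            if (bEntered[0]) {
                b.leaveCriticalSection();
            }
        });
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        if (aEntered) {
            a.leaveCriticalSection();
        }
        return aEntered && bEntered[0];
    }

    public static void main(String[] args) {
        System.out.println("Both instances entered: " + bothInstancesEnter());
    }
}
```

Each instance's lock only excludes threads that share that lock object, so nothing stops A and B from updating the cache at the same time.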

2. What is a distributed lock

To solve the local-lock problem above, we need a lock that works in a distributed cluster environment: when querying the DB, only one thread across all instances may access it, and the other threads must wait for that thread to release the lock before they can continue.

An everyday analogy: think of the lock as the lock on a door and the concurrent threads as people. Everyone wants to enter the room, but only one person may be inside at a time. Whoever gets in locks the door, and the others must wait until that person comes out.

Let’s take a look at the basic principles of distributed locks, as shown in the figure below:

Let's analyze the distributed lock in the figure above:

1. The front end forwards 100,000 highly concurrent requests to the four question microservice instances.
2. Each instance handles 25,000 requests.
3. Each thread handling a request must acquire the lock before executing the business logic. Think of this as "claiming the spot".
4. The thread that acquired the lock releases it after finishing the business logic. Think of this as "giving up the spot".
5. Threads that did not acquire the lock wait for it to be released.
6. Once the lock is released, the other threads compete for it again.
7. Repeat steps 4, 5, and 6.

In plain terms: every request thread goes to the same place to "claim the spot". A thread that gets the spot executes the business logic; a thread that does not must wait for another thread to give the spot up. The spot is visible to all threads and can live in the Redis cache or in a database. This article shows how to use Redis to build this "distributed spot".

3. SETNX of Redis

Redis is reachable by every service, so it can serve as the shared place where threads "claim the spot".

All of the Redis-based distributed lock solutions in this article use the SETNX command, which sets a key to a given value. The more advanced solutions differ only in the number of arguments they pass and in the failure cases they account for.

Let's look at the command. SETNX is short for "SET if Not eXists": when the key does not exist, set its value; when it already exists, do nothing.

This is how it is executed in the Redis command line:

set <key> <value> NX

We can enter the redis container to try the SETNX command.

Enter the container first:

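docker exec -it <container id> redis-cli

Then run the SETNX command to set the value of the key wukong to 1111.

set wukong 1111 NX

OK is returned, which means the key was set. Run the command again and nil is returned, which means the set failed.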
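To make the set-if-absent semantics concrete, here is a minimal in-memory sketch in Java. It does not talk to Redis: the class SetNxSketch and its methods are hypothetical names, and ConcurrentHashMap.putIfAbsent is used as a stand-in for SET ... NX under that assumption.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for Redis (assumption); it only mimics the NX behaviour.
class SetNxSketch {

    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    /** Mimics "SET key value NX": stores the value and returns true only if the key was absent. */
    boolean setIfAbsent(String key, String value) {
        return store.putIfAbsent(key, value) == null;
    }

    /** Returns the stored value, or null when the key is absent. */
    String get(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        SetNxSketch redis = new SetNxSketch();
        System.out.println(redis.setIfAbsent("wukong", "1111")); // first caller wins
        System.out.println(redis.setIfAbsent("wukong", "2222")); // key exists, nothing changes
        System.out.println(redis.get("wukong"));
    }
}
```

The second call leaves the original value untouched, which is the property the lock solutions below rely on: only one caller can create the key.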

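4. Bronze Solution

Let's first implement the simplest distributed lock with Redis's SETNX command.

3.1 How the Bronze Solution Works

Let's look at the flow chart:

Multiple concurrent threads all request the lock from Redis by executing the setnx command. Suppose thread A succeeds; thread A now holds the lock.
The setnx command fails for every other thread, so they must wait for thread A to release the lock.
After thread A finishes its business logic, it deletes the lock.
The other threads compete for the lock again by executing setnx. Because thread A has deleted the lock, one of them can now acquire it.

The sample code follows. In Java, the setnx command corresponds to setIfAbsent.

The first argument of setIfAbsent is the key and the second is the value.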

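// 1. Try to acquire the lock
Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "123");
// setIfAbsent returns a Boolean that may be null, so compare against Boolean.TRUE
if (Boolean.TRUE.equals(lock)) {
  // 2. Lock acquired: execute the business logic
  List<TypeEntity> typeEntityListFromDb = getDataFromDB();
  // 3. Release the lock
  redisTemplate.delete("lock");
  return typeEntityListFromDb;
} else {
  // 4. Sleep for a while
  sleep(100);
  // 5. Lock not acquired: retry after the wait
  return getTypeEntityListByRedisDistributedLock();
}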

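A quick question: why sleep for a while?

Because the method retries by calling itself recursively. Without a pause between attempts, the recursion would deepen very quickly and could overflow the stack.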
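Since every failed attempt in the recursive version adds a stack frame, one possible alternative is to retry in a bounded loop. The sketch below is not the article's code: acquireWithRetry, maxAttempts, and pauseMillis are hypothetical names, and the BooleanSupplier stands in for a call such as setIfAbsent.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: retries a lock attempt in a loop instead of by recursion.
class RetryLoopSketch {

    /**
     * Calls tryLock up to maxAttempts times, pausing between failed attempts.
     * Returns the 1-based attempt number that succeeded, or -1 if none did
     * (or if the thread was interrupted while waiting).
     */
    static int acquireWithRetry(BooleanSupplier tryLock, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tryLock.getAsBoolean()) {
                return attempt;
            }
            try {
                Thread.sleep(pauseMillis); // wait before the next attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated lock that only becomes free on the third attempt.
        int attempt = acquireWithRetry(() -> ++calls[0] >= 3, 5, 10);
        System.out.println("Acquired on attempt " + attempt);
    }
}
```

The stack depth stays constant however long the wait is, and the attempt limit gives the caller a point at which to give up rather than wait forever.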

3.2 Defects of the Bronze Solution

Bronze is called bronze because it is the most basic solution, and it has serious problems.

Picture a scene at home: at night, Xiao Kong unlocks the door, enters the room alone, and turns on the light. Suddenly the power goes out. Xiao Kong wants to open the door and leave but cannot find the lock in the dark, so Xiao Kong cannot get out and nobody outside can get in.

From a technical point of view: setnx acquires the lock, but then the business code throws an exception or the server goes down, so the logic that deletes the lock never runs. The result is a deadlock.
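The failure can be replayed with a small in-memory sketch. This is an illustration under assumptions, not the article's code: the map stands in for Redis, and BronzeDeadlockSketch and runBusiness are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

class BronzeDeadlockSketch {

    // In-memory stand-in for Redis (assumption); putIfAbsent plays the role of SETNX.
    private final ConcurrentHashMap<String, String> fakeRedis = new ConcurrentHashMap<>();

    /** Bronze flow: returns false if the lock is taken; throws if the business step fails. */
    boolean runBusiness(boolean failMidway) {
        if (fakeRedis.putIfAbsent("lock", "123") != null) {
            return false; // someone else holds the lock
        }
        if (failMidway) {
            // Simulated failure: the delete below is never reached.
            throw new IllegalStateException("simulated business failure");
        }
        fakeRedis.remove("lock");
        return true;
    }

    public static void main(String[] args) {
        BronzeDeadlockSketch sketch = new BronzeDeadlockSketch();
        try {
            sketch.runBusiness(true);
        } catch (IllegalStateException e) {
            System.out.println("First caller failed while holding the lock");
        }
        // The lock key is still present, so no later caller can acquire it.
        System.out.println("Second caller acquired lock: " + sketch.runBusiness(false));
    }
}
```

A try/finally around the business step would cover the exception case, but not a server crash, where no cleanup code runs at all.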

So how to avoid this risk?

Set an automatic expiration time on the lock. After that time, the lock is deleted automatically so that other threads can acquire it.

4. Silver Solution

4.1 Examples in Life

The bronze solution above can deadlock. Applying the fix just described, an automatic expiration time, gives us the silver solution.

Again an everyday example: after unlocking the door, Xiao Kong sets an hourglass countdown ⏳ on the smart lock. When the hourglass runs out, the door lock opens automatically. Even if the power suddenly goes out in the room, the lock opens by itself after a while and others can come in.

4.2 Technical Schematic

The difference from the bronze solution is that after the lock is acquired, an expiration time is set on it. Acquiring the lock and setting the expiration are two separate steps, as shown below:

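4.3 Sample Code

The code that sets the expiration on the Redis key:

// Automatically remove the lock after 10 s
redisTemplate.expire("lock", 10, TimeUnit.SECONDS);

The complete code:

// 1. Try to acquire the lock
Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "123");
// setIfAbsent returns a Boolean that may be null, so compare against Boolean.TRUE
if (Boolean.TRUE.equals(lock)) {
    // 2. Automatically remove the lock after 10 s
    redisTemplate.expire("lock", 10, TimeUnit.SECONDS);
    // 3. Lock acquired: execute the business logic
    List<TypeEntity> typeEntityListFromDb = getDataFromDB();
    // 4. Release the lock
    redisTemplate.delete("lock");
    return typeEntityListFromDb;
}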

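4.4 Defects of the Silver Solution

The silver solution appears to solve the problem of a lock that is never released after a thread exception or a server crash, but a problem remains:

Acquiring the lock and setting the expiration time are two separate steps, so if a failure occurs between them, the expiration time is never set.

This leaves the same problem as the bronze solution: the lock never expires.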

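5. Gold Solution

5.1 Atomic Command

In the silver solution above, acquiring the lock and setting its expiration time are two separate steps. This should bring one idea to mind: the atomicity of transactions.

Atomicity: a group of commands either all execute successfully or none of them execute.

So we combine the two steps into one: acquire the lock and set its expiration time together.

Redis supports exactly this operation:

# Set the value of a key together with an expiration time in milliseconds or seconds.
set <key> <value> PX <milliseconds> NX
or
set <key> <value> EX <seconds> NX

You can then watch the key's remaining time to live with:

ttl <key>

The following demonstrates setting a key together with an expiration time. Note: delete the key before running the commands, either from a client or with a command.

# Set key=wukong, value=1111, expiration=5000 ms
set wukong 1111 PX 5000 NX
# Check the key's remaining time to live
ttl wukong

The result is shown in the figure below: each time ttl runs, the remaining time of wukong is lower. Eventually it returns -2, which means the key has expired and no longer exists.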
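The behaviour of claiming a key and setting its expiry in one step can be approximated with an in-memory sketch. Everything here is an assumption for illustration: ExpiringLockStoreSketch is a hypothetical class, the clock is injected so the demo is deterministic, and the remaining time is truncated to whole seconds, which may not match how Redis rounds.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// In-memory stand-in for Redis (assumption); names here are hypothetical.
class ExpiringLockStoreSketch {

    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final LongSupplier clockMillis; // injectable clock keeps the demo deterministic

    ExpiringLockStoreSketch(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Mimics "SET key value PX ttl NX": the claim and its expiry happen in one synchronized step. */
    synchronized boolean setIfAbsent(String key, String value, long ttlMillis) {
        long now = clockMillis.getAsLong();
        Entry existing = entries.get(key);
        if (existing != null && existing.expiresAtMillis > now) {
            return false; // key exists and has not expired yet
        }
        entries.put(key, new Entry(value, now + ttlMillis));
        return true;
    }

    /** Returns the live value of the key, or null when it is missing or expired. */
    synchronized String get(String key) {
        Entry entry = entries.get(key);
        boolean live = entry != null && entry.expiresAtMillis > clockMillis.getAsLong();
        return live ? entry.value : null;
    }

    /** Mimics TTL: remaining whole seconds, or -2 when the key is missing or expired. */
    synchronized long ttlSeconds(String key) {
        long now = clockMillis.getAsLong();
        Entry entry = entries.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            return -2;
        }
        return (entry.expiresAtMillis - now) / 1000;
    }

    public static void main(String[] args) {
        long[] now = {0};
        ExpiringLockStoreSketch store = new ExpiringLockStoreSketch(() -> now[0]);
        System.out.println(store.setIfAbsent("wukong", "1111", 5000));
        now[0] = 2000;
        System.out.println(store.ttlSeconds("wukong"));
        now[0] = 5000;
        System.out.println(store.ttlSeconds("wukong"));
    }
}
```

Because the value and the deadline are written in the same step, there is no window in which the key exists without an expiry, which is the gap the silver solution leaves open.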

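5.2 Technical Schematic

The gold solution differs from the silver solution in that the expiration time is set in the same command that acquires the lock. It is one atomic operation: either both take effect or neither does. As shown below:

5.3 Sample Code

Set the value of lock to 123 with an expiration time of 10 seconds. If lock still exists after 10 seconds, Redis removes it.

Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", "123", 10, TimeUnit.SECONDS);

5.4 Defects of the Gold Solution

Let's again use an everyday example to look at the defects of the gold solution.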

5.4.1 User A acquires the lock

User A acquires the lock first and sets it to unlock automatically after 10 seconds. The lock's number is 123.
10 seconds later, A is still executing its task, and the lock opens automatically.

5.4.2 User B acquires the lock

User B sees that the room's lock is open, acquires it, sets the lock number to 123, and sets the expiration time to 10 seconds.
Only one user is allowed to work in the room at a time, so the tasks of user A and user B now conflict.
User A finishes its task at 15 s, while user B is still executing its task.
User A opens the lock numbered 123.
User B, still executing its task, finds that the lock has been opened.
User B is very angry: I haven't finished my task yet, so why is the lock open?

5.4.3 User C acquires the lock

After A opens B's lock, A leaves the room while B is still executing its task.
User C acquires the lock and starts executing its task.
Only one user is allowed to work in the room at a time, so the tasks of user B and user C now conflict.

The case above shows that because user A needs longer to process its task than the lock's automatic expiration time, another user acquires the lock after it expires. When user A finishes, it then opens a lock that belongs to someone else.

Why does A open someone else's lock? Because every lock carries the same number, "123". User A only checks the number and opens any lock numbered "123". As a result, user B's lock is opened before user B has finished its task, and of course user B is angry.
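The A/B/C timeline above can be replayed deterministically with an in-memory stand-in for Redis. This is a sketch under assumptions: GoldDefectSketch is a hypothetical class, time is advanced by hand through the nowMillis field, and expiry is checked lazily when the key is touched.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for Redis with a hand-driven clock (assumption).
class GoldDefectSketch {

    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    long nowMillis = 0;

    /** Mimics "SET key value PX ttl NX". */
    boolean setIfAbsent(String key, String value, long ttlMillis) {
        evictIfExpired(key);
        if (values.containsKey(key)) {
            return false;
        }
        values.put(key, value);
        expiresAt.put(key, nowMillis + ttlMillis);
        return true;
    }

    /** Unconditional delete, as in the gold solution's unlock step. */
    void delete(String key) {
        values.remove(key);
        expiresAt.remove(key);
    }

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= nowMillis) {
            delete(key);
        }
    }

    /** Replays the timeline; returns true if C acquired the lock while B was still working. */
    static boolean replayTimeline() {
        GoldDefectSketch redis = new GoldDefectSketch();
        boolean aLocked = redis.setIfAbsent("lock", "123", 10_000); // 0 s: A acquires
        redis.nowMillis = 10_000;                                   // 10 s: A's lock expires, A still busy
        boolean bLocked = redis.setIfAbsent("lock", "123", 10_000); // B acquires
        redis.nowMillis = 15_000;                                   // 15 s: A finishes
        redis.delete("lock");                                       // A deletes "lock", which is now B's
        boolean cLocked = redis.setIfAbsent("lock", "123", 10_000); // C acquires while B is still busy
        return aLocked && bLocked && cLocked;
    }

    public static void main(String[] args) {
        System.out.println("B and C both hold the lock: " + replayTimeline());
    }
}
```

Nothing in the value "123" ties the key to the client that created it, so A's delete cannot tell its own lock from B's.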

6. Platinum Solution

6.1 Examples from real life

The defect of the gold solution is easy to fix: give each lock a different number.

As shown in the figure below, the lock acquired by B is blue, unlike the green lock acquired by A, so A will not open it.

I made an animation to make this easier to understand:

The static picture is sharper; take a look:

6.2 Technical Schematic Diagram

Differences from the gold solution:

When setting the lock's expiration time, also set a unique number as the lock's value.
Before actively deleting the lock, check whether the lock's number matches the one you set. If it matches, the lock is your own and you may delete it.

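6.3 Sample Code

// 1. Generate a unique id
String uuid = UUID.randomUUID().toString();
// 2. Try to acquire the lock with a 10 s expiration
Boolean lock = redisTemplate.opsForValue().setIfAbsent("lock", uuid, 10, TimeUnit.SECONDS);
// setIfAbsent returns a Boolean that may be null, so compare against Boolean.TRUE
if (Boolean.TRUE.equals(lock)) {
    System.out.println("Lock acquired: " + uuid);
    // 3. Lock acquired: execute the business logic
    List<TypeEntity> typeEntityListFromDb = getDataFromDB();
    // 4. Read the current value of the lock
    String lockValue = redisTemplate.opsForValue().get("lock");
    // 5. If the lock's value equals the value we set, delete our own lock
    if (uuid.equals(lockValue)) {
        System.out.println("Releasing lock: " + lockValue);
        redisTemplate.delete("lock");
    }
    return typeEntityListFromDb;
} else {
    System.out.println("Failed to acquire the lock; waiting for it to be released");
    // Sleep for a while
    sleep(100);
    // Lock not acquired: retry after the wait
    return getTypeEntityListByRedisDistributedLock();
}

1. Generate a random unique id to use as the lock's value.
2. Try to acquire the lock with an expiration time of 10 s and the random unique id as its value.
3. Lock acquired: execute the business logic.
4. After the business logic finishes, read the current value of the lock.
5. If the lock's value equals the value that was set, delete your own lock.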

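6.4 Defects of the Platinum Solution

The solution above looks perfect, but a problem remains: steps 4 and 5 are not atomic.

Time 0 s: thread A acquires the lock.
Time 9.5 s: thread A asks Redis for the current value of the key.
Time 10 s: the lock expires automatically.
Time 11 s: thread B acquires the lock.
Time 12 s: after a slow round trip, thread A finally receives the lock's value.
Time 13 s: thread A compares the value it set with the value returned. They are equal, so it deletes the lock, but the lock it deletes is actually the one thread B acquired.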
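The timeline above can be replayed step by step against an in-memory stand-in for Redis. This is a sketch under assumptions: PlatinumRaceSketch is a hypothetical class, the clock is advanced by hand, and the slow round trip is modelled by holding on to the value read at 9.5 s.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for Redis with a hand-driven clock (assumption).
class PlatinumRaceSketch {

    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    long nowMillis = 0;

    /** Mimics "SET key value PX ttl NX". */
    boolean setIfAbsent(String key, String value, long ttlMillis) {
        evictIfExpired(key);
        if (values.containsKey(key)) {
            return false;
        }
        values.put(key, value);
        expiresAt.put(key, nowMillis + ttlMillis);
        return true;
    }

    String get(String key) {
        evictIfExpired(key);
        return values.get(key);
    }

    void delete(String key) {
        values.remove(key);
        expiresAt.remove(key);
    }

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= nowMillis) {
            delete(key);
        }
    }

    /** Replays the timeline; returns the lock value left afterwards (null means B's lock was deleted). */
    static String replayTimeline() {
        PlatinumRaceSketch redis = new PlatinumRaceSketch();
        redis.setIfAbsent("lock", "uuid-A", 10_000); // 0 s: A acquires
        redis.nowMillis = 9_500;
        String seenByA = redis.get("lock");          // 9.5 s: Redis reads "uuid-A" for A
        redis.nowMillis = 11_000;                    // the lock expired at 10 s
        redis.setIfAbsent("lock", "uuid-B", 10_000); // 11 s: B acquires
        redis.nowMillis = 13_000;
        if ("uuid-A".equals(seenByA)) {              // 13 s: A compares the stale reply
            redis.delete("lock");                    // and deletes what is now B's lock
        }
        return redis.get("lock");
    }

    public static void main(String[] args) {
        System.out.println("Lock value after A's unlock: " + replayTimeline());
    }
}
```

The comparison itself is correct; the trouble is that the lock can change owner between the read and the delete.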

So how do we avoid this risk? Enter the diamond solution.

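7. Diamond Solution

In the scenario above, thread A's query of the lock and its deletion of the lock are not atomic, so the fix is to perform the query and the delete as a single atomic operation.

7.1 Technical Schematic

As shown in the figure below, the part circled in red is what differs in the diamond solution: the lock is deleted by a script, which makes the operation atomic.

7.2 Sample Code

So how do we delete the lock with a script?

Let's first look at this Redis script: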

if redis.call("get",KEYS[1]) == ARGV[1]
then
    return redis.call("del",KEYS[1])
else
    return 0
end

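This script closely resembles the platinum solution's get-then-delete logic. It reads the value of KEYS[1] and checks whether it equals ARGV[1]; if it does, it deletes KEYS[1].

So how do we run this script in a Java project?

In two steps: define the script, then execute it with the redisTemplate.execute method.

// Unlock with a script
String script = "if redis.call('get',KEYS[1]) == ARGV[1] then return redis.call('del',KEYS[1]) else return 0 end";
redisTemplate.execute(new DefaultRedisScript<Long>(script, Long.class), Arrays.asList("lock"), uuid);

In the code above, KEYS[1] corresponds to "lock" and ARGV[1] corresponds to uuid. The meaning is: if the value of lock equals uuid, delete lock.

Redis executes this script in its embedded Lua environment, which is why it is also called a Lua script.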
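For readers who want to see the compare-and-delete idea without a Redis server, the sketch below uses ConcurrentHashMap.remove(key, value), which removes an entry only if it currently maps to the given value and does so atomically. This is an in-memory analogue under assumptions, not the Lua script itself; CompareAndDeleteSketch and deleteIfValueMatches are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory analogue of the unlock script (assumption); it does not talk to Redis.
class CompareAndDeleteSketch {

    private final ConcurrentHashMap<String, String> store = new ConcurrentHashMap<>();

    boolean setIfAbsent(String key, String value) {
        return store.putIfAbsent(key, value) == null;
    }

    boolean exists(String key) {
        return store.containsKey(key);
    }

    /**
     * Deletes the key only if its current value equals expectedValue.
     * Returns 1 if the key was deleted and 0 otherwise, mirroring the script's return values.
     */
    long deleteIfValueMatches(String key, String expectedValue) {
        return store.remove(key, expectedValue) ? 1L : 0L;
    }

    public static void main(String[] args) {
        CompareAndDeleteSketch redis = new CompareAndDeleteSketch();
        redis.setIfAbsent("lock", "uuid-B"); // B currently owns the lock
        System.out.println(redis.deleteIfValueMatches("lock", "uuid-A")); // A's late unlock is refused
        System.out.println(redis.deleteIfValueMatches("lock", "uuid-B")); // B's own unlock succeeds
    }
}
```

Because the check and the removal are one operation, a client holding a stale id can no longer delete a lock that has since passed to someone else.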

Is the diamond solution perfect, then? Is there a better one?

In the next article we introduce the king of distributed lock solutions: Redisson.

8. Summary

This article used the problem with local locks to introduce the need for distributed locks, then presented five distributed lock solutions and explained, step by step, what each one improves.

Following how the solutions evolve shows where failures can occur in the system and how to handle them better.

The same evolutionary way of thinking can be applied to other technologies.

The shortcomings and improvements of the five solutions are summarized below.

Bronze Solution:

Defect: If the business code throws an exception or the server goes down, the logic that deletes the lock never runs, resulting in a deadlock.
Improvement: Set an automatic expiration time on the lock. After that time, the lock is deleted automatically so that other threads can acquire it.

Silver Solution:

Defect: Acquiring the lock and setting its expiration time are two separate steps, not an atomic operation.
Improvement: Acquire the lock and set its expiration time in one atomic operation.

Gold Solution:

Defect: Because every lock has the same value, actively deleting the lock can delete a lock held by another client.
Improvement: Each time the lock is acquired, set its value to a random, unique value. Before actively deleting the lock, compare the lock's value with the value you set and delete only if they are equal.

Platinum Solution:

Defect: Reading the lock, comparing its value, and deleting it are three steps that are not atomic. The lock may expire midway and be acquired by another client, so the delete removes that client's lock.
Improvement: Use a Lua script to read, compare, and delete the lock as one atomic operation.

Diamond Solution:

Defect: Not a professional-grade distributed lock solution.
Improvement: The Redisson distributed lock.

The king of solutions: see you in the next article~

Statement

This article is reproduced from Java后端技术全栈. In case of infringement, please contact admin@php.cn for removal.