
Various implementation methods of distributed locks

大家讲道理 (Original) · 2016-11-08 10:08:36

Today, almost all large websites and applications are deployed in a distributed manner, and data consistency in distributed scenarios has always been an important topic. The CAP theorem tells us that no distributed system can simultaneously satisfy consistency (Consistency), availability (Availability), and partition tolerance (Partition tolerance); at most two can be satisfied at once. Therefore, many systems must make trade-offs among the three at design time. In most Internet scenarios, strong consistency is sacrificed in exchange for high availability: the system often only needs to guarantee "eventual consistency", as long as the time it takes to converge is acceptable to users.

In many scenarios, guaranteeing eventual consistency of data requires supporting techniques such as distributed transactions and distributed locks. Sometimes we need to ensure that a method is executed by only one thread at a time. In a single-machine environment, Java provides many concurrency APIs for this, but they are of no help in a distributed scenario; plain Java APIs cannot provide distributed locking. Hence there are many different ways to implement a distributed lock.

For the implementation of distributed locks, the following solutions are currently commonly used:

  • Implement distributed locks based on a database

  • Implement distributed locks based on a cache (Redis, Memcached, Tair)

  • Implement distributed locks based on ZooKeeper

Before analyzing these solutions, let's first consider what the distributed lock we need should look like. (A method lock is used as the example here; a resource lock works the same way.)

  • It can ensure that, in a cluster of distributed applications, the same method is executed by at most one thread on one machine at a time.

  • This lock should be reentrant (to avoid deadlocks).

  • Ideally this lock is a blocking lock (decide according to business needs).

  • Acquiring and releasing the lock should be highly available.

  • Acquiring and releasing the lock should perform well.

Implementing distributed locks based on database

Based on database tables

The simplest way to implement a distributed lock is probably to create a lock table directly and then operate on the records in that table.

When we want to lock a method or resource, we add a record to the table, and delete the record when we want to release the lock.

Create a database table like this:

CREATE TABLE `methodLock` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
  `method_name` varchar(64) NOT NULL DEFAULT '' COMMENT 'name of the locked method',
  `desc` varchar(1024) NOT NULL DEFAULT '' COMMENT 'remarks',
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'record time, auto-generated',
  PRIMARY KEY (`id`),
  UNIQUE KEY `uidx_method_name` (`method_name`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='methods currently locked';

When we want to lock a method, execute the following SQL:

insert into methodLock(method_name, desc) values ('method_name', 'desc');

Because there is a unique constraint on method_name, if multiple requests are submitted to the database at the same time, the database guarantees that only one insert succeeds. We can then consider the thread whose insert succeeded to have acquired the method's lock, and it may execute the method body.

After the method has executed, release the lock by executing the following SQL:

delete from methodLock where method_name ='method_name'
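The insert/delete scheme can be sketched with an in-memory stand-in for the table. This is a minimal simulation assuming nothing beyond the JDK: a concurrent set plays the role of the unique-keyed methodLock table, so add corresponds to the insert (which fails on a duplicate key) and remove to the delete. Class and method names are illustrative.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simulation of the lock-table scheme: the set stands in for the methodLock
// table, and the atomicity of Set.add mirrors the UNIQUE KEY on method_name.
public class TableLockSimulation {
    private static final Set<String> lockTable = ConcurrentHashMap.newKeySet();

    // Corresponds to: insert into methodLock(method_name, desc) values (?, ?)
    public static boolean tryLock(String methodName) {
        return lockTable.add(methodName); // returns false if the "row" already exists
    }

    // Corresponds to: delete from methodLock where method_name = ?
    public static void unlock(String methodName) {
        lockTable.remove(methodName);
    }
}
```

Only one caller can succeed for a given method name until the record is deleted, which is exactly the behavior the unique constraint provides.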

The simple implementation above has the following problems:

1. The lock strongly depends on the availability of the database. The database is a single point of failure; once it goes down, the business system becomes unavailable.

2. The lock has no expiration time. If the unlock operation fails, the lock record remains in the database and no other thread can ever acquire the lock.

3. The lock can only be non-blocking, because a failed insert reports an error immediately; a thread that fails to acquire the lock does not enter any queue and must trigger the acquisition again to retry.

4. The lock is non-reentrant: the same thread cannot acquire the lock again before releasing it, because the record already exists in the table.

Of course, there are ways to solve these problems.

Database a single point? Deploy two databases with bidirectional synchronization, and switch to the standby quickly once the primary fails.

No expiration time? Run a scheduled task that cleans up timed-out records in the database at regular intervals.

Non-blocking? Use a while loop that keeps retrying and returns success only once the insert succeeds.

Non-reentrant? Add a column to the table recording the host and thread information of the current lock holder. On the next acquisition, query the database first; if the current machine's host and thread information matches the record, grant the lock to it directly.
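The reentrancy fix just described can be sketched as follows. This is a hypothetical in-memory simulation, not real database code: the map stands in for the extended lock table, and the owner string stands in for the host-plus-thread column; all names are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Reentrant variant of the lock-table scheme: record who holds the lock,
// and let the same owner acquire it again without a new "insert".
public class ReentrantTableLock {
    private static final Map<String, String> lockOwners = new ConcurrentHashMap<>();

    // In the real scheme this would be host name + thread id stored in a column.
    private static String currentOwner() {
        return "localhost:" + Thread.currentThread().getId();
    }

    public static boolean tryLock(String methodName) {
        String owner = currentOwner();
        // putIfAbsent mirrors the unique-key insert; the equality check mirrors
        // "query first, and hand the lock back to the recorded owner".
        String existing = lockOwners.putIfAbsent(methodName, owner);
        return existing == null || existing.equals(owner);
    }

    public static void unlock(String methodName) {
        // Only the owner may delete its own record.
        lockOwners.remove(methodName, currentOwner());
    }
}
```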

Based on database exclusive lock

In addition to inserting and deleting records in the table, you can also use the database's built-in locks to implement a distributed lock.

We still use the table just created. A distributed lock can be implemented through the database's exclusive locks. With MySQL's InnoDB engine, the lock can be acquired as follows:

public boolean lock() {
    try {
        connection.setAutoCommit(false);
    } catch (SQLException e) {
        return false;
    }
    while (true) {
        try (PreparedStatement ps = connection.prepareStatement(
                "select * from methodLock where method_name = ? for update")) {
            ps.setString(1, methodName);
            ResultSet rs = ps.executeQuery();
            if (rs.next()) {
                // The matched row is now exclusively locked by this transaction.
                return true;
            }
        } catch (SQLException e) {
            // e.g. lock wait timeout: fall through and retry
        }
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

Appending for update to the query makes the database place an exclusive lock on the record during the query. Once a record holds an exclusive lock, other threads can no longer place an exclusive lock on that record.

We can consider the thread that obtains the exclusive lock to have obtained the distributed lock. After acquiring the lock, it can execute the method's business logic, and afterwards unlock with the following method:

public void unlock() throws SQLException {
    connection.commit();
}

The lock is released by the connection.commit() operation.

This method effectively solves the problems mentioned above of being unable to release the lock and of the lock being non-blocking.

Blocking lock? A for update statement returns immediately when it succeeds, and blocks while the lock cannot be obtained, until it succeeds (or the database's lock wait timeout is reached).

Service crashes after locking, so the lock cannot be released? With this approach, when the service crashes, the database rolls back the connection's transaction and releases the lock by itself.

However, this still does not directly solve the database single-point problem or the reentrancy problem.

Summary

To summarize the database-based implementations: both rely on a table in the database. One determines whether a lock currently exists from the presence of a record in the table; the other implements the distributed lock through the database's exclusive locks.

Advantages of database-based distributed locks

It builds directly on the database and is easy to understand.

Disadvantages of database-based distributed locks

All kinds of problems arise, and solving them makes the whole scheme more and more complicated.

Operating on the database incurs a certain overhead, so performance has to be considered.

Implementing distributed locks based on cache

Compared with the database-based schemes, a cache-based implementation performs better. Moreover, many caches can be deployed as clusters, which addresses the single-point problem.

There are many mature cache products, including Redis, Memcached, and Tair (used internally at the author's company).

Here Tair is taken as the example to analyze a cache-based distributed lock. There are many articles online about Redis and Memcached, along with mature frameworks and algorithms that can be used directly.

There are many internal articles on implementing distributed locks with Tair; the main approach uses the TairManager.put method.

public boolean tryLock(String key) {
    ResultCode code = ldbTairManager.put(NAMESPACE, key, "This is a Lock.", 2, 0);
    return ResultCode.SUCCESS.equals(code);
}

public void unlock(String key) {
    ldbTairManager.invalid(NAMESPACE, key);
}

The implementation above again has several problems:

1. The lock has no expiration time. If the unlock operation fails, the lock record stays in Tair forever and no other thread can acquire the lock.

2. The lock can only be non-blocking: it returns immediately whether it succeeds or fails.

3. The lock is non-reentrant: after a thread acquires the lock, it cannot acquire it again before releasing it, because the key already exists in Tair and another put cannot be performed.

Of course, these can likewise be solved:

No expiration time? Tair's put method accepts an expiration time; the data is automatically deleted once that time is reached.

Non-blocking? Retry in a while loop.

Non-reentrant? When a thread acquires the lock, record the current host and thread information; before acquiring again, first check whether it is already the lock's owner.

But how long should the expiration time be? If it is too short, the lock may be released automatically before the method finishes, causing concurrency problems; if it is too long, other threads wanting the lock may have to wait needlessly. The same problem exists for the database-based distributed lock.
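The expiry-based scheme can be sketched with an in-memory stand-in for the cache. This is a minimal simulation, not Tair's or Redis's actual API: a map from key to lease-expiry timestamp plays the role that Tair's put-with-expire or Redis's SET ... NX PX plays in production, and all class and method names are illustrative.

```java
import java.util.concurrent.ConcurrentHashMap;

// Simulation of a cache-style lock with a lease: a key maps to the timestamp
// at which the current holder's lease expires.
public class ExpiringLockSimulation {
    private static final ConcurrentHashMap<String, Long> locks = new ConcurrentHashMap<>();

    public static boolean tryLock(String key, long ttlMillis) {
        long now = System.currentTimeMillis();
        // Claim the key if it is absent; mirrors SETNX / put-if-absent.
        Long prev = locks.putIfAbsent(key, now + ttlMillis);
        if (prev == null) {
            return true;
        }
        // If the previous holder's lease has expired, take over atomically.
        return prev <= now && locks.replace(key, prev, now + ttlMillis);
    }

    public static void unlock(String key) {
        locks.remove(key);
    }
}
```

The takeover branch makes the dilemma concrete: once the lease runs out, another thread can grab the lock even if the original holder is still running its method body.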

Summary

A cache can replace the database as the basis for a distributed lock, which provides better performance; at the same time, many cache services are deployed as clusters, avoiding the single-point problem. Many cache services also provide operations suitable for implementing distributed locks, such as Tair's put method and Redis's SETNX, and they support automatic expiry of data, so a timeout can directly control lock release.

Advantages of cache-based distributed locks

Good performance, and the implementation is fairly convenient.

Disadvantages of cache-based distributed locks

Controlling the lock's lifetime via a timeout is not entirely reliable.

Implementing distributed locks based on ZooKeeper

A distributed lock can be implemented using ZooKeeper's ephemeral sequential nodes.

The rough idea: when a client wants to lock a method, it creates a unique ephemeral sequential node under the directory of the node designated for that method on ZooKeeper. Determining whether the lock has been acquired is simple: the client only needs to check whether its node has the smallest sequence number among the ordered nodes. To release the lock, it just deletes its ephemeral node. This also avoids the deadlock that arises when a crashed service cannot release its lock.

Let's see whether ZooKeeper can solve the problems mentioned earlier.

Lock cannot be released? ZooKeeper effectively solves this. When creating the lock, the client creates an ephemeral node in ZK; if the client crashes after acquiring the lock (its session disconnects), the ephemeral node is deleted automatically and other clients can acquire the lock.

Non-blocking lock? A blocking lock can be implemented with ZooKeeper. The client creates a sequential node in ZK and attaches a watcher to it. When the node set changes, ZooKeeper notifies the client, which checks whether its own node now has the smallest sequence number among all the nodes; if so, it has acquired the lock and can execute the business logic.

Non-reentrant? ZooKeeper can also effectively solve the reentrancy problem. When creating its node, the client writes its host and thread information directly into the node's data. Before acquiring the lock again, it compares the data of the current smallest node with its own information; if they match, it acquires the lock directly, otherwise it creates a new ephemeral sequential node and joins the queue.

Single point? ZooKeeper effectively solves the single-point problem: ZK is deployed as a cluster, and as long as more than half of the machines in the cluster survive, it can serve requests.
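The smallest-sequence-number idea can be sketched with an in-memory simulation. This is not the ZooKeeper API: a sorted map and a counter stand in for ephemeral sequential znodes, and deleting a node models both an explicit release and a client session dying; all names are illustrative.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Simulation of ZooKeeper-style lock ordering: each lock attempt "creates" a
// sequential node, and the lock holder is the owner of the smallest surviving
// sequence number.
public class ZkLockSimulation {
    private static final AtomicLong counter = new AtomicLong();
    private static final ConcurrentSkipListMap<Long, String> nodes = new ConcurrentSkipListMap<>();

    // "Create an ephemeral sequential node" and return its sequence number.
    public static long createNode(String clientId) {
        long seq = counter.incrementAndGet();
        nodes.put(seq, clientId);
        return seq;
    }

    // A client holds the lock iff its node has the smallest sequence number.
    public static boolean holdsLock(long seq) {
        return nodes.firstKey() == seq;
    }

    // Deleting the node releases the lock; in real ZK this also happens
    // automatically when the owning client's session ends.
    public static void deleteNode(long seq) {
        nodes.remove(seq);
    }
}
```

Real implementations additionally set a watch on the immediately preceding node, so each waiter is notified exactly when it becomes the smallest.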

You can directly use the third-party ZooKeeper client library Curator, which encapsulates a reentrant lock service.

public boolean tryLock(long timeout, TimeUnit unit) {
    try {
        return interProcessMutex.acquire(timeout, unit);
    } catch (Exception e) {
        log.error(e.getMessage(), e);
        return false;
    }
}

public boolean unlock() {
    try {
        interProcessMutex.release();
    } catch (Throwable e) {
        log.error(e.getMessage(), e);
    } finally {
        executorService.schedule(new Cleaner(client, path), delayTimeForClean, TimeUnit.MILLISECONDS);
    }
    return true;
}

The InterProcessMutex provided by Curator is its distributed lock implementation: the acquire method acquires the lock, and the release method releases it.

The distributed lock implemented with ZK seems to fully meet all the expectations for a distributed lock listed at the beginning of this article. In reality, however, it has one shortcoming: performance may not be as high as with a cache service, because each acquisition and release of the lock dynamically creates and destroys ephemeral nodes. In ZK, nodes can only be created and deleted through the Leader server, and the data then has to be synchronized to all the Follower machines.

Summary

The advantages of using Zookeeper to implement distributed locks

It effectively solves the single-point, non-reentrancy, non-blocking, and unreleased-lock problems, and is relatively simple to implement.

Disadvantages of using Zookeeper to implement distributed locks

Performance is not as good as a cache-based distributed lock, and you need to understand ZK's principles.

Comparison of three solutions

From the perspective of difficulty of understanding (from low to high)

Database > Cache > Zookeeper

From the perspective of implementation complexity (from low to high)

Zookeeper >= Cache > Database

From a performance perspective (from high to low)

Cache > Zookeeper >= Database

From a reliability perspective (from high to low)

Zookeeper > Cache > Database

