
[Fighting Java Concurrency]-----In-depth analysis of the implementation principle of synchronized

黄舟 (Original), 2017-02-24 09:58:50

I remember that when I first started learning Java, I reached for synchronized whenever multi-threading came up. To us at that time, synchronized seemed magical and powerful. We called it "synchronization", and it became our tried and tested answer to every multi-threaded situation. As our study progressed, however, we learned that synchronized is a heavyweight lock. Compared with Lock it seemed cumbersome, so we decided it was inefficient and gradually abandoned it.

It is true that with the various optimizations made to synchronized in Java SE 1.6, synchronized no longer appears so heavy. Let's follow LZ (the author) and explore the implementation mechanism of synchronized, how Java optimizes it, the lock optimization mechanisms, the lock storage structure, and the upgrade process.

Implementation principle

synchronized ensures that only one thread at a time can enter the critical section of a method or code block, and it also ensures the memory visibility of shared variables.

Every object in Java can be used as a lock; this is the basis on which synchronized implements synchronization:
1. Ordinary synchronized method: the lock is the current instance object
2. Static synchronized method: the lock is the Class object of the current class
3. Synchronized block: the lock is the object in the parentheses
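The three cases in the list above can be checked from inside the code with Thread.holdsLock. The following is only a small sketch; the class name LockTargets and the field guard are made up for illustration.

```java
class LockTargets {
    // Hypothetical lock object used by the synchronized block below.
    private final Object guard = new Object();

    // 1. Ordinary synchronized method: the lock is the current instance.
    synchronized boolean instanceMethodLocksThis() {
        return Thread.holdsLock(this);
    }

    // 2. Static synchronized method: the lock is the Class object.
    static synchronized boolean staticMethodLocksClass() {
        return Thread.holdsLock(LockTargets.class);
    }

    // 3. Synchronized block: the lock is the object in the parentheses,
    //    so `this` is not locked here.
    boolean blockLocksGuardOnly() {
        synchronized (guard) {
            return Thread.holdsLock(guard) && !Thread.holdsLock(this);
        }
    }

    public static void main(String[] args) {
        LockTargets t = new LockTargets();
        System.out.println(t.instanceMethodLocksThis());
        System.out.println(staticMethodLocksClass());
        System.out.println(t.blockLocksGuardOnly());
    }
}
```

Each method reports whether the calling thread holds the lock that the list says it should hold.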

When a thread reaches a synchronized code block, it must first acquire the lock before executing the synchronized code, and it must release the lock when it exits or throws an exception. So how is this mechanism implemented? Let's look at a simple piece of code first:

public class SynchronizedTest {
    public synchronized void test1() {

    }

    public void test2() {
        synchronized (this) {

        }
    }
}

Use the javap tool to inspect the generated class file and analyze how synchronized is implemented:
(Figure: javap output for the SynchronizedTest class file)
As the output shows, the synchronized code block is implemented with the monitorenter and monitorexit instructions, while the synchronized method relies on the ACC_SYNCHRONIZED flag in the method's modifiers (this is less obvious from the output; you need to look at the underlying JVM implementation).
Synchronized code block: The monitorenter instruction is inserted at the beginning of the synchronized code block, and the monitorexit instruction is inserted at its end (including the exceptional exit path). The JVM must ensure that every monitorenter has a corresponding monitorexit. Every object has a monitor associated with it, and when the monitor is held, the object is in the locked state. When a thread executes the monitorenter instruction, it tries to take ownership of the object's monitor, that is, it tries to acquire the object's lock;
Synchronized method: A synchronized method is compiled into ordinary method call and return instructions, such as invokevirtual and areturn. There is no special instruction at the VM bytecode level to implement a method modified by synchronized. Instead, the synchronized flag in the access_flags field of the method is set to 1 in the method table of the Class file, indicating that the method is synchronized. The JVM then uses the object on which the method is called, or the Klass that represents the method's class inside the JVM, as the lock object. (Excerpted from: http://www.php.cn/)
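The difference between the two forms is also visible through reflection, because the synchronized flag of a method is exposed by java.lang.reflect.Modifier. A minimal sketch follows; the class name SyncFlagDemo and the helper hasSyncFlag are invented for this example, and the two methods mirror test1 and test2 from the code above.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class SyncFlagDemo {
    // Synchronized method: carries the synchronized flag in its access flags.
    public synchronized void test1() {
    }

    // Synchronized block: compiled to monitorenter/monitorexit, no method flag.
    public void test2() {
        synchronized (this) {
        }
    }

    // Hypothetical helper: reports whether the named no-arg method has the flag set.
    static boolean hasSyncFlag(String methodName) {
        try {
            Method m = SyncFlagDemo.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("test1: " + hasSyncFlag("test1"));
        System.out.println("test2: " + hasSyncFlag("test2"));
    }
}
```

Only test1 reports the flag; test2 is an ordinary method whose body contains the monitor instructions.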

Let’s continue the analysis, but before we go deeper, we need to understand two important concepts: Java object header and Monitor.

Java object header, monitor

The Java object header and the monitor are the basis for implementing synchronized. These two concepts are introduced in detail below.

Java object header

The lock used by synchronized is stored in the Java object header. So what is the Java object header? The object header of the HotSpot virtual machine mainly consists of two parts: the Mark Word (mark field) and the Klass Pointer (type pointer). The Klass Pointer is the object's pointer to its class metadata; the virtual machine uses it to determine which class the object is an instance of. The Mark Word stores the runtime data of the object itself and is the key to implementing lightweight locks and biased locks, so the following focuses on the Mark Word.
The Mark Word stores the runtime data of the object itself, such as the hash code (HashCode), GC generational age, lock status flag, lock held by the thread, biased thread ID, and biased timestamp. A Java object header generally occupies two machine words (in a 32-bit virtual machine, one machine word is 4 bytes, i.e. 32 bits), but an array object needs three, because the JVM can determine the size of an ordinary Java object from its metadata but cannot determine the size of an array that way, so one extra word records the array length. The following figure shows the storage structure of the Java object header (32-bit virtual machine):
(Figure: storage structure of the Java object header, 32-bit virtual machine)
The object header is additional storage overhead unrelated to the data defined by the object itself. For space efficiency, the Mark Word is designed as a non-fixed data structure so that it can store as much data as possible in a very small space: it reuses its own storage according to the state of the object. In other words, the Mark Word changes as the program runs. The possible states are as follows (32-bit virtual machine):
(Figure: Mark Word contents in each lock state, 32-bit virtual machine)
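To make the lock flag values used later in this article (01, 00, 10, plus the biased bit) easier to follow, here is a small decoding sketch. It works on a plain int that stands in for a 32-bit Mark Word and does not read a real object header; the class name MarkWordBits is made up, and the value 11 for the GC mark is taken from HotSpot's 32-bit layout rather than from the text above.

```java
class MarkWordBits {
    // Decodes the lock state from the low three bits of a value that
    // stands in for a 32-bit Mark Word (a model, not a real header read).
    static String lockState(int markWord) {
        int lockFlag = markWord & 0b11;            // lowest two bits: lock flag
        boolean biased = (markWord & 0b100) != 0;  // third bit: biased-lock bit
        switch (lockFlag) {
            case 0b01:
                return biased ? "biased" : "unlocked";
            case 0b00:
                return "lightweight";
            case 0b10:
                return "heavyweight";
            default:
                return "gc-mark";                  // 11
        }
    }

    public static void main(String[] args) {
        System.out.println(lockState(0b001)); // lock-free: biased bit 0, flag 01
        System.out.println(lockState(0b101)); // biased: biased bit 1, flag 01
        System.out.println(lockState(0b000)); // lightweight lock: flag 00
        System.out.println(lockState(0b010)); // heavyweight lock: flag 10
    }
}
```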

That was a brief introduction to the Java object header. Next, let's look at the Monitor.

Monitor

What is a Monitor? We can think of it as a synchronization tool or a synchronization mechanism; it is usually described as an object.
Just as everything is an object, every Java object is a born Monitor. Every Java object has the potential to become a Monitor, because by design every Java object comes into the world carrying an invisible lock, called the intrinsic lock or Monitor lock.
The Monitor is a thread-private data structure. Each thread has a list of available monitor records, and there is also a global available list. Each locked object is associated with a monitor (the LockWord in the MarkWord of the object header points to the starting address of the monitor). The monitor also has an Owner field that stores the unique identifier of the thread that owns the lock, indicating that the lock is occupied by this thread. Its structure is as follows:
(Figure: structure of the monitor record)
Owner: Initially NULL, meaning that no thread currently owns the monitor record. When a thread successfully acquires the lock, its unique identifier is stored here; when the lock is released, the field is set back to NULL;
EntryQ: Associated with a system mutex (semaphore); blocks all threads that fail to lock the monitor record.
RcThis: Indicates the number of all threads blocked or waiting on the monitor record.
Nest: Used to implement the reentrant lock count.
HashCode: Stores the HashCode value copied from the object header (may also include the GC age).
Candidate: Used to avoid unnecessary blocking or waking of waiting threads. Only one thread can own the lock at a time, so if the thread releasing the lock woke up every blocked or waiting thread, it would cause unnecessary context switches (from blocked to ready, then blocked again after losing the competition for the lock) and serious performance degradation. Candidate has only two possible values: 0 means that no thread needs to be woken up; 1 means that one successor thread should be woken up to compete for the lock.
(Excerpted from: Implementation principles and applications of synchronized in Java)
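The roles of Owner and Nest can be illustrated with a toy model. This is only a sketch and not the JVM's monitor: the class ToyMonitor and its methods are invented, it borrows Java's own wait/notify in place of EntryQ, and it wakes a single waiter on release in the spirit of the Candidate field.

```java
class ToyMonitor {
    private Thread owner; // Owner: null while no thread holds the monitor
    private int nest;     // Nest: reentrancy count of the owning thread

    synchronized void enter() {
        Thread me = Thread.currentThread();
        if (owner == me) {        // reentrant acquisition: just count it
            nest++;
            return;
        }
        boolean interrupted = false;
        while (owner != null) {   // stand-in for blocking on EntryQ
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // remember and keep waiting
            }
        }
        owner = me;
        nest = 1;
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    synchronized void exit() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("not the owner");
        }
        if (--nest == 0) {
            owner = null;         // Owner goes back to null on release
            notify();             // wake one successor, not every waiter
        }
    }

    synchronized int nestCount() {
        return nest;
    }

    synchronized boolean ownedByCurrentThread() {
        return owner == Thread.currentThread();
    }

    public static void main(String[] args) {
        ToyMonitor m = new ToyMonitor();
        m.enter();
        m.enter();
        System.out.println("nest after two enters: " + m.nestCount());
        m.exit();
        m.exit();
        System.out.println("still owned: " + m.ownedByCurrentThread());
    }
}
```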
We know synchronized as a heavyweight lock that is not very efficient, and this idea has stuck in our minds. But the implementation of synchronized was reworked in JDK 1.6 with various optimizations so that it is no longer so heavy. So what optimizations has the JVM adopted?

Lock optimization

JDK 1.6 introduced a large number of optimizations to the implementation of locks, such as spin locks, adaptive spin locks, lock elimination, lock coarsening, biased locks, and lightweight locks, to reduce the overhead of lock operations.
A lock has four main states: lock-free, biased, lightweight, and heavyweight. It escalates through them as contention intensifies. Note that locks can be upgraded but not downgraded; this strategy is meant to improve the efficiency of acquiring and releasing locks.

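Spin lock

Blocking and waking a thread requires the CPU to switch from user mode to kernel mode. Frequent blocking and waking is heavy work for the CPU and inevitably puts great pressure on the system's concurrent performance. We also find that in many applications an object lock stays locked for only a very short time, and it is not worth frequently blocking and waking threads for such a short period. Hence the spin lock.
What is a spin lock?
A spin lock makes the thread wait for a while instead of being suspended immediately, to see whether the thread holding the lock will release it soon. How does it wait? By executing a meaningless loop (spinning).
Spinning cannot replace blocking. Leaving aside the requirement on the number of processors (multiple cores are needed, though single-core processors seem to be rare now), spinning avoids the overhead of thread switching but occupies processor time. If the thread holding the lock releases it quickly, spinning is very efficient; otherwise the spinning thread wastes processor resources without doing any meaningful work, a classic dog in the manger, and this hurts performance instead. Therefore the spin wait time (number of spins) must be limited: if the thread has spun past the limit without acquiring the lock, it should be suspended.
Spin locks were introduced in JDK 1.4.2, disabled by default, and could be enabled with -XX:+UseSpinning; they are enabled by default in JDK 1.6. The default number of spins is 10, which can be adjusted with the -XX:PreBlockSpin parameter.
Adjusting the spin count through -XX:PreBlockSpin is inconvenient. Suppose I set the parameter to 10, but many threads in the system release the lock just after you give up spinning (one or two more spins would have acquired it). Awkward, isn't it? So JDK 1.6 introduced adaptive spin locks, which make the virtual machine smarter and smarter.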
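The idea of spinning up to a limit and then backing off can be sketched at the library level. This is not how the JVM implements spinning inside synchronized: the class BoundedSpinLock is invented for illustration, it uses an AtomicBoolean as the lock word, and a short LockSupport.parkNanos pause stands in for suspending the thread.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

class BoundedSpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);
    private final int maxSpins; // spin limit, in the spirit of PreBlockSpin

    BoundedSpinLock(int maxSpins) {
        this.maxSpins = maxSpins;
    }

    void lock() {
        int spins = 0;
        // Spin: retry the CAS in a loop instead of suspending immediately.
        while (!held.compareAndSet(false, true)) {
            if (++spins >= maxSpins) {
                // Spin budget used up: stop burning CPU and pause briefly
                // (a stand-in for suspending the thread).
                LockSupport.parkNanos(1_000L);
                spins = 0;
            }
        }
    }

    void unlock() {
        held.set(false);
    }

    // Runs `threads` workers that each add 1 to a shared counter `perThread` times.
    static int runDemo(int threads, int perThread) {
        BoundedSpinLock lock = new BoundedSpinLock(10);
        int[] counter = {0};
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10_000));
    }
}
```

The fixed limit of 10 passed in runDemo mirrors the default spin count mentioned above; a fixed limit is exactly what the adaptive variant tries to improve on.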

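Adaptive spin lock

JDK 1.6 introduced a smarter spin lock, the adaptive spin lock. Adaptive means that the number of spins is no longer fixed; it is determined by the previous spin time on the same lock and the state of the lock's owner. How does it work? If a thread spins successfully, the next spin is allowed more iterations, because the virtual machine assumes that since the last spin succeeded, this one is likely to succeed too. Conversely, if spinning rarely succeeds for a given lock, the number of spins is reduced for later attempts to acquire that lock, or spinning is skipped entirely, to avoid wasting processor resources.
With adaptive spin locks, as the program runs and the performance monitoring information improves, the virtual machine's predictions about the program's locks become more and more accurate, and the virtual machine becomes smarter and smarter.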
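The feedback idea can be shown with a toy per-lock spin budget. The doubling and halving policy, the cap of 1000, and the class name AdaptiveSpinBudget are all assumptions made for this sketch; HotSpot's actual heuristic is different and more involved.

```java
class AdaptiveSpinBudget {
    private static final int MAX_SPINS = 1_000; // hypothetical upper bound
    private int spins = 10;                     // start from the old fixed default

    // Spinning got the lock last time: allow more spins next time.
    void recordSpinSuccess() {
        spins = Math.min(MAX_SPINS, Math.max(1, spins * 2));
    }

    // Spinning failed: spin less next time, possibly not at all (0).
    void recordSpinFailure() {
        spins = spins / 2;
    }

    int currentBudget() {
        return spins;
    }

    public static void main(String[] args) {
        AdaptiveSpinBudget budget = new AdaptiveSpinBudget();
        budget.recordSpinSuccess();
        System.out.println("after a success: " + budget.currentBudget());
        budget.recordSpinFailure();
        budget.recordSpinFailure();
        System.out.println("after two failures: " + budget.currentBudget());
    }
}
```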

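Lock elimination

To ensure data integrity, we need synchronization control over some operations. In some cases, however, the JVM detects that no contention on shared data is possible, and it then eliminates these locks. Lock elimination is based on data from escape analysis.
If there is no contention, why lock at all? Lock elimination saves the time spent on pointless lock requests. Whether a variable escapes has to be determined by the virtual machine through data flow analysis, but isn't it obvious to us programmers? Would we add synchronization in front of a code block that we know has no data race? Sometimes, though, the program is not what we think. Even if we do not use locks explicitly, some built-in JDK APIs such as StringBuffer, Vector and Hashtable lock implicitly. Examples are the append() method of StringBuffer and the add() method of Vector:

    public void vectorTest(){
        Vector<String> vector = new Vector<String>();
        for(int i = 0 ; i < 10 ; i++){
            vector.add(i + "");
        }

        System.out.println(vector);
    }

When this code runs, the JVM can clearly detect that the variable vector does not escape the method vectorTest(), so it can safely eliminate the locking inside vector.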

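Lock coarsening

We know that when using a synchronized lock, the scope of the synchronized block should be as small as possible, synchronizing only within the actual scope of the shared data. The goal is to minimize the number of operations that need synchronization so that, if there is contention, waiting threads can get the lock as soon as possible.
In most cases this view is correct, and LZ has always held it. However, a series of consecutive lock and unlock operations may cause unnecessary performance loss, so the concept of lock coarsening was introduced.
Lock coarsening is easy to understand: it joins multiple consecutive lock and unlock operations into one lock with a larger scope. In the example above, vector locks on every add; when the JVM detects consecutive lock and unlock operations on the same object (vector), it merges them into one lock and unlock operation with a larger scope, that is, the locking and unlocking move outside the for loop.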
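What coarsening amounts to can be written out by hand. The sketch below is only an illustration of the effect, not something the JIT compiler produces as source; the class CoarseningSketch and its two methods are invented. In the second method the outer synchronized block takes the vector's lock once, and the add() calls inside simply re-enter the lock the thread already holds.

```java
import java.util.Vector;

class CoarseningSketch {
    // As written by the programmer: every add() locks and unlocks the vector.
    static Vector<String> fineGrained() {
        Vector<String> vector = new Vector<String>();
        for (int i = 0; i < 10; i++) {
            vector.add(i + "");
        }
        return vector;
    }

    // Hand-written equivalent of the coarsened form: one lock around the loop.
    static Vector<String> coarsened() {
        Vector<String> vector = new Vector<String>();
        synchronized (vector) {
            for (int i = 0; i < 10; i++) {
                vector.add(i + ""); // reentrant: the lock is already held
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(fineGrained().equals(coarsened()));
    }
}
```

Both methods build the same contents; only the number of lock acquisitions and releases differs.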

Lightweight Lock

The main purpose of introducing lightweight locks is to reduce the performance cost of traditional heavyweight locks, which use operating system mutexes, when there is no multi-thread contention. When biased locking is turned off, or when multiple threads compete for a biased lock and it is upgraded to a lightweight lock, the JVM tries to acquire the lightweight lock. The steps are as follows:
Acquiring the lock
1. Determine whether the current object is in the lock-free state (hashcode, 0, 01). If so, the JVM first creates a space called the Lock Record in the current thread's stack frame to store a copy of the lock object's current Mark Word (officially this copy is given the prefix Displaced, i.e. Displaced Mark Word); otherwise, go to step (3);
2. The JVM uses a CAS operation to try to update the object's Mark Word to a pointer to the Lock Record. If it succeeds, the thread has won the lock, the lock flag is changed to 00 (indicating that the object is in the lightweight lock state), and the synchronized operation is performed; if it fails, go to step (3);
3. Determine whether the current object's Mark Word points to the current thread's stack frame. If so, the current thread already holds the lock on this object and executes the synchronized code block directly; otherwise, the lock object has been taken by another thread. The lightweight lock must then be inflated into a heavyweight lock, the lock flag becomes 10, and threads waiting afterwards enter the blocked state;

Releasing the lock
The lightweight lock is also released through a CAS operation. The main steps are as follows:
1. Retrieve the data saved in the Displaced Mark Word when the lightweight lock was acquired;
2. Use a CAS operation to replace the current object's Mark Word with the retrieved data. If it succeeds, the lock is released successfully; otherwise go to step (3);
3. If the CAS replacement fails, another thread has tried to acquire the lock, so the suspended thread must be woken up while the lock is released.
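The acquire and release steps above can be modeled with an AtomicReference standing in for the Mark Word. This is a deliberately simplified sketch: every name in it (LightweightLockModel, UnlockedMark, LockRecord, Result) is invented, the Lock Record is an ordinary heap object rather than a slot in a stack frame, and inflation to a heavyweight lock and thread wake-up are only reported, not performed.

```java
import java.util.concurrent.atomic.AtomicReference;

class LightweightLockModel {
    enum Result { ACQUIRED, ALREADY_OWNED, MUST_INFLATE }

    // Stand-in for the unlocked Mark Word contents (hashcode, 0, 01).
    static final class UnlockedMark {
    }

    // Stand-in for the Lock Record holding the Displaced Mark Word.
    static final class LockRecord {
        final Thread owner;
        final Object displacedMark;

        LockRecord(Thread owner, Object displacedMark) {
            this.owner = owner;
            this.displacedMark = displacedMark;
        }
    }

    // Holds either an UnlockedMark or a LockRecord ("pointer to the Lock Record").
    private final AtomicReference<Object> markWord =
            new AtomicReference<Object>(new UnlockedMark());

    Result tryLock() {
        Object mark = markWord.get();
        if (mark instanceof UnlockedMark) {
            // Step 1: copy the current Mark Word into a new Lock Record.
            LockRecord record = new LockRecord(Thread.currentThread(), mark);
            // Step 2: CAS the Mark Word so that it points to the Lock Record.
            if (markWord.compareAndSet(mark, record)) {
                return Result.ACQUIRED;
            }
            mark = markWord.get();
        }
        // Step 3: does the Mark Word already point to this thread's record?
        if (mark instanceof LockRecord
                && ((LockRecord) mark).owner == Thread.currentThread()) {
            return Result.ALREADY_OWNED;
        }
        return Result.MUST_INFLATE; // the real JVM would inflate the lock here
    }

    // Release: CAS the Displaced Mark Word back into the Mark Word.
    // Returns false if this thread is not the owner or the CAS fails.
    boolean unlock() {
        Object mark = markWord.get();
        if (!(mark instanceof LockRecord)) {
            return false;
        }
        LockRecord record = (LockRecord) mark;
        if (record.owner != Thread.currentThread()) {
            return false;
        }
        return markWord.compareAndSet(record, record.displacedMark);
    }

    boolean isUnlocked() {
        return markWord.get() instanceof UnlockedMark;
    }

    // Helper for demonstration: attempts tryLock() on a second thread.
    static Result tryLockFromAnotherThread(LightweightLockModel lock) {
        AtomicReference<Result> out = new AtomicReference<Result>();
        Thread t = new Thread(() -> out.set(lock.tryLock()));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out.get();
    }

    public static void main(String[] args) {
        LightweightLockModel lock = new LightweightLockModel();
        System.out.println(lock.tryLock());
        System.out.println(tryLockFromAnotherThread(lock));
        System.out.println(lock.unlock());
    }
}
```

In the uncontended case the whole cycle costs one CAS to lock and one CAS to unlock, which is the assumption the lightweight lock is built on.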

For lightweight locks, the basis of the performance gain is that "for most locks, there is no contention during their entire life cycle". If this assumption is broken, the CAS operations come on top of the mutex overhead, so under multi-thread contention lightweight locks are slower than heavyweight locks.


(Figure: acquisition and release process of lightweight locks)

Biased lock

The main purpose of introducing biased locks is to minimize unnecessary lightweight lock execution paths when there is no multi-thread contention. As mentioned above, locking and unlocking a lightweight lock requires multiple CAS atomic instructions. So how does biased locking reduce unnecessary CAS operations? Looking at the structure of the Mark Word makes it clear: it only needs to check the biased-lock bit, the lock flag, and the ThreadID. The processing flow is as follows:
Acquiring the lock
1. Check whether the Mark Word is in the biasable state, that is, whether the biased-lock bit is 1 and the lock flag is 01;
2. If it is biasable, test whether the thread ID is the current thread's ID. If so, go to step (5); otherwise go to step (3);
3. If the thread ID is not the current thread's ID, compete for the lock with a CAS operation. If the competition succeeds, replace the thread ID in the Mark Word with the current thread's ID; otherwise go to step (4);
4. Failing to win the lock through CAS proves that there is multi-thread contention. When the global safepoint is reached, the thread that holds the biased lock is suspended, the biased lock is upgraded to a lightweight lock, and the thread blocked at the safepoint then continues to execute the synchronized code block;
5. Execute the synchronized code block.

Releasing the lock
A biased lock is released only when there is contention. The thread holding it does not release the bias on its own initiative; it waits until another thread competes for the lock. Revoking a biased lock requires waiting for the global safepoint (a point in time at which no bytecode is executing). The steps are as follows:
1. Suspend the thread that owns the biased lock and determine whether the lock object is still locked;
2. Revoke the biased lock, returning to the lock-free state (01) or the lightweight lock state.
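The saving that biased locking aims for, a single CAS when the bias is first installed and plain comparisons afterwards, can be seen in a toy model of the acquisition steps. The class BiasedLockModel, its Result values, and the CAS counter are invented for this sketch; revocation at a safepoint is only reported as a result, not carried out.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class BiasedLockModel {
    enum Result { FAST_PATH, BIASED_BY_CAS, REVOKE_NEEDED }

    // Stand-in for the thread ID field of the Mark Word; null means not yet biased.
    private final AtomicReference<Thread> biasedThread = new AtomicReference<Thread>();
    private final AtomicInteger casAttempts = new AtomicInteger();

    Result acquire() {
        Thread me = Thread.currentThread();
        // Step 2: already biased toward this thread, so no CAS is needed.
        if (biasedThread.get() == me) {
            return Result.FAST_PATH;
        }
        // Step 3: try to install this thread's ID with a CAS.
        casAttempts.incrementAndGet();
        if (biasedThread.compareAndSet(null, me)) {
            return Result.BIASED_BY_CAS;
        }
        // Step 4: CAS failed, so there is contention and the bias must be revoked.
        return Result.REVOKE_NEEDED;
    }

    int casCount() {
        return casAttempts.get();
    }

    // Helper for demonstration: attempts acquire() on a second thread.
    static Result acquireFromAnotherThread(BiasedLockModel lock) {
        AtomicReference<Result> out = new AtomicReference<Result>();
        Thread t = new Thread(() -> out.set(lock.acquire()));
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out.get();
    }

    public static void main(String[] args) {
        BiasedLockModel lock = new BiasedLockModel();
        System.out.println(lock.acquire());
        System.out.println(lock.acquire());
        System.out.println("CAS attempts so far: " + lock.casCount());
        System.out.println(acquireFromAnotherThread(lock));
    }
}
```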


(Figure: acquisition and release process of biased locks)

Heavyweight lock

A heavyweight lock is implemented through the monitor inside the object, and the monitor in turn relies on the Mutex Lock of the underlying operating system. Switching between threads requires the operating system to switch from user mode to kernel mode, and this switch is very costly.
