Did you use the right lock? A brief discussion on Java "lock" matters
In every era, people who can learn will not be treated badly
Recently I noticed that new colleagues at my company have some misunderstandings about locks, so today let's talk about "locks" and about what to watch for when using concurrency-safe containers in Java.
Before that, we need to explain why locks are needed at all, and for that we have to start from the source of concurrency bugs.
I wrote an article about this in 2019. Looking back at it now, it is rather embarrassing.
Let us look at what this source is. A computer has a CPU, memory, and a hard disk. The hard disk is the slowest to read, followed by memory. Memory is still far too slow compared with the CPU, so CPU caches were added: L1, L2, and L3.
It is this CPU cache, combined with today's multi-core CPUs, that produces concurrency bugs.
Consider a very simple method that does nothing but execute a++ on a shared variable a whose initial value is 0. If thread A and thread B execute this method on CPU-A and CPU-B respectively, each of them first loads a from main memory into its own CPU cache. At this point the value of a in both caches is 0.
Then each of them executes a++, so each sees a as 1. When a is later flushed back to main memory, its value is still 1. That is the problem: the increment was clearly executed twice, yet the final result is 1, not 2.
This problem is called the visibility problem.
Now look at the statement a++ itself. The languages we use today are high-level languages, and a statement like this is really a kind of syntactic sugar. It is convenient to write, but that is only the surface: what actually executes is not a single instruction.
A statement in a high-level language can be translated into more than one CPU instruction. For example, a++ is translated into at least three CPU instructions:
Load a from memory into a register;
Add 1 in the register;
Write the result back to the cache or memory.
So we assume that a++ cannot be interrupted because it looks atomic, but in fact the thread's time slice may run out after the CPU has executed only one of these instructions. The context then switches to another thread, which also executes a++. When the first thread is switched back in, the value of a it works with is already wrong.
This problem is called the atomicity problem.
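The lost update described above can be reproduced with a small sketch. The class and method names here (`LostUpdateDemo`, `run`) are made up for illustration and are not from the original article; the sketch simply lets two threads increment a plain `int` and an `AtomicInteger` side by side.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: compares a plain counter with an atomic one.
class LostUpdateDemo {
    static int plain = 0;                                   // unprotected shared counter
    static final AtomicInteger atomic = new AtomicInteger();

    // Two threads each increment both counters perThread times.
    // Returns {plain, atomic} after both threads have finished.
    static int[] run(int perThread) {
        plain = 0;
        atomic.set(0);
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                plain++;                   // load, add, store: the steps can interleave
                atomic.incrementAndGet();  // one atomic read-modify-write
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {plain, atomic.get()};
    }

    public static void main(String[] args) {
        int[] r = run(1_000_000);
        System.out.println("plain=" + r[0] + " atomic=" + r[1]);
    }
}
```

On a multi-core machine the plain counter usually ends up below the total number of increments, while the atomic counter always reaches it. The exact shortfall varies from run to run.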
In addition, to optimize performance the compiler or interpreter may change the execution order of statements. This is called instruction reordering, and the most classic example is double-checked locking in the singleton pattern. The CPU also executes out of order to improve efficiency. For example, while the CPU is waiting for data to be loaded from memory, it may find that a later addition instruction does not depend on the result of the pending instruction, so it executes the addition first.
This problem is called the ordering problem.
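Since double-checked locking is named as the classic example, here is a minimal sketch of the usual safe form. The class name `ConfigHolder` is an assumption chosen for illustration; the point is only the `volatile` field, which prevents the reference from being published before the constructor's writes are visible.

```java
// Hypothetical singleton used only to illustrate double-checked locking.
class ConfigHolder {
    // volatile forbids reordering the publication of the reference with the
    // writes made by the constructor, so no thread sees a half-built object.
    private static volatile ConfigHolder instance;

    private ConfigHolder() { }

    static ConfigHolder getInstance() {
        ConfigHolder local = instance;            // first check, without the lock
        if (local == null) {
            synchronized (ConfigHolder.class) {
                local = instance;                 // second check, under the lock
                if (local == null) {
                    local = new ConfigHolder();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Without `volatile`, another thread could in principle observe a non-null `instance` whose fields have not been initialized yet.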
So far we have analyzed the sources of concurrency bugs, namely these three problems. CPU caches, multi-core CPUs, high-level languages, and out-of-order execution are all necessary, so we can only face these problems head-on.
The remedies are to disable caching, to forbid compiler instruction reordering, to use mutual exclusion, and so on. Today's topic is mutual exclusion.
Mutual exclusion guarantees that modifications to shared variables exclude each other, that is, only one thread executes them at a time. When it comes to mutual exclusion, I believe the first thing that comes to mind is the lock. Yes, today's topic is locks! Locks are designed to solve the atomicity problem.
When it comes to locks, the first reaction of Java developers is the synchronized keyword, since it is supported at the language level. Let's look at synchronized first. Some people do not understand synchronized well, so they run into many pitfalls when using it.
Let's first look at a piece of code. It models a salary raise: one thread raises two salaries step by step until both reach one million, while another thread keeps checking whether the two salaries are equal. A brief note on IntStream.rangeClosed(1,1000000).forEach: some people may not be familiar with it, but it is equivalent to a for loop that runs 1 million times.
Please think it over first: is there any problem? At first glance it seems fine. The raise runs in a single thread, and the comparing thread never modifies the salaries, so there appears to be no contention on shared resources. The fields are also declared volatile to guarantee visibility.
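The article's original listing is not available here, so the following is a reconstruction from the description, not the author's exact code. The field and method names `yesSalary`, `yourSalary`, `raiseSalary` and `compareSalary` come from the text; the class name `SalaryDemo`, the `mismatches` counter (standing in for the log line) and the `run` helper are assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Reconstruction of the unsynchronized demo described in the text.
class SalaryDemo {
    volatile int yesSalary;
    volatile int yourSalary;
    // Counts how often the comparison saw different values (replaces logging).
    final AtomicInteger mismatches = new AtomicInteger();

    void raiseSalary() {
        IntStream.rangeClosed(1, 1_000_000).forEach(i -> {
            yesSalary++;    // only this one thread ever writes the salaries
            yourSalary++;
        });
    }

    void compareSalary() {
        IntStream.rangeClosed(1, 1_000_000).forEach(i -> {
            if (yesSalary != yourSalary) {      // may run between the two increments
                mismatches.incrementAndGet();
            }
        });
    }

    // Runs the raise and the comparison in two threads and waits for both.
    static SalaryDemo run() {
        SalaryDemo demo = new SalaryDemo();
        Thread raiser = new Thread(demo::raiseSalary);
        Thread comparer = new Thread(demo::compareSalary);
        raiser.start();
        comparer.start();
        try {
            raiser.join();
            comparer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return demo;
    }

    public static void main(String[] args) {
        SalaryDemo d = run();
        System.out.println("mismatches seen: " + d.mismatches.get());
    }
}
```

The number of mismatches depends on thread scheduling and differs between runs, so no particular output should be expected.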
Let's take a look at the results; I took a screenshot.
You can see that, first, the log was printed at all, and second, the values it printed are equal! Surprised? Some people may instinctively think that raiseSalary is the one doing the modifying, so it must be a thread-safety issue, and add a lock to raiseSalary!
Please note that only one thread calls the raiseSalary method, so locking the raiseSalary method alone is useless.
This is actually the atomicity problem I mentioned above. Imagine that the raising thread has incremented yesSalary but has not yet incremented yourSalary, and at that moment the comparing thread evaluates yesSalary != yourSalary. The condition is certainly true, and that is why the log is printed.
Furthermore, because volatile guarantees visibility, by the time the log line is written the increment of yourSalary may already have happened, so the log output shows yesSalary == yourSalary.
So the simplest solution is to mark both raiseSalary() and compareSalary() as synchronized, so that the raising thread and the comparing thread never execute at the same time. That is definitely safe!
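A minimal sketch of that fix, again as a reconstruction rather than the author's listing. The class name `SafeSalaryDemo`, the `mismatches` field and the `run` helper are assumptions; both methods lock the same instance, so a comparison can never observe the state between the two increments.

```java
// Reconstruction of the synchronized variant described in the text.
class SafeSalaryDemo {
    private int yesSalary;
    private int yourSalary;
    private int mismatches;   // how often the comparison saw different values

    // Both methods synchronize on this instance, so they exclude each other.
    synchronized void raiseSalary() {
        for (int i = 0; i < 1_000_000; i++) {
            yesSalary++;
            yourSalary++;
        }
    }

    synchronized void compareSalary() {
        for (int i = 0; i < 1_000_000; i++) {
            if (yesSalary != yourSalary) {
                mismatches++;
            }
        }
    }

    synchronized int mismatches() { return mismatches; }
    synchronized int yesSalary()  { return yesSalary; }
    synchronized int yourSalary() { return yourSalary; }

    // Runs the raise and the comparison in two threads and waits for both.
    static SafeSalaryDemo run() {
        SafeSalaryDemo demo = new SafeSalaryDemo();
        Thread raiser = new Thread(demo::raiseSalary);
        Thread comparer = new Thread(demo::compareSalary);
        raiser.start();
        comparer.start();
        try {
            raiser.join();
            comparer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return demo;
    }
}
```

Because the whole method body is inside the lock, the two loops run one after the other rather than interleaved. That is the price of this simplest fix.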
Locking looks quite simple, but synchronized still has a pitfall for novices: you have to pay attention to what synchronized actually locks.
For example, I changed the code to raise the salary with multiple threads. A further note on parallel: it runs the work on the common ForkJoinPool, whose default parallelism is the number of CPU cores minus one, with the calling thread also taking part.
Because raiseSalary() is locked, the final result is correct. This is because synchronized locks on the yesLockDemo instance. There is only one instance in our main method, so all the threads compete for a single lock, and the final calculated data is correct.
Then I modify the code so that each thread raises the salary through its own yesLockDemo instance.
Now you will find that the lock seems useless. The promised annual salary of one million has shrunk to about 100,000, although at least yours still shows about 700,000.
This is because the lock now sits on a non-static method, which makes it an instance-level lock, and we created one instance per thread. These threads are therefore not competing for the same lock at all. The earlier multi-threaded version was correct only because every thread used the same instance and so competed for one lock. To make the current code correct, you only need to change the instance-level lock into a class-level lock.
It is very simple: just turn the method into a static method. synchronized on a static method is a class-level lock.
Another way is to declare a static variable and lock on it. This is the more recommended approach, because turning a non-static method into a static one changes the structure of the code. To summarize: when using synchronized, pay attention to what is being locked. Locking on a static field or marking a static method synchronized gives a class-level lock; locking on a non-static field or marking a non-static method synchronized gives an instance-level lock.
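The difference between the two lock levels can be sketched as follows. All names here (`PerInstanceRaise`, `CLASS_LOCK`, the `run...` helpers) are hypothetical; the sketch creates a fresh instance for every increment, which is the situation where an instance-level lock protects nothing.

```java
import java.util.stream.IntStream;

// Hypothetical demo: one shared static salary, one new instance per raise.
class PerInstanceRaise {
    static int salary = 0;                                   // shared by all instances
    private static final Object CLASS_LOCK = new Object();   // one lock for the class

    // Instance-level lock: each new instance has its own monitor,
    // so different instances do not exclude each other.
    synchronized void raiseWithInstanceLock() {
        salary++;
    }

    // Class-level lock: every instance synchronizes on the same static object.
    void raiseWithStaticLock() {
        synchronized (CLASS_LOCK) {
            salary++;
        }
    }

    static int runWithInstanceLock(int n) {
        salary = 0;
        IntStream.rangeClosed(1, n).parallel()
                 .forEach(i -> new PerInstanceRaise().raiseWithInstanceLock());
        return salary;   // may be lower than n because updates can be lost
    }

    static int runWithStaticLock(int n) {
        salary = 0;
        IntStream.rangeClosed(1, n).parallel()
                 .forEach(i -> new PerInstanceRaise().raiseWithStaticLock());
        synchronized (CLASS_LOCK) {
            return salary;
        }
    }

    public static void main(String[] args) {
        System.out.println("instance lock: " + runWithInstanceLock(1_000_000));
        System.out.println("static lock:   " + runWithStaticLock(1_000_000));
    }
}
```

Only the static-lock variant is guaranteed to reach the full count. How far the instance-lock variant falls short depends on the machine and the run.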
I believe everyone knows that Hashtable is not recommended and that ConcurrentHashMap should be used instead. Although Hashtable is thread-safe, it is heavy-handed: it puts the same lock on all of its methods! In its source code, methods such as contains and size are all declared synchronized on the same instance.
What does contains have to do with size? Why can't another thread call size while I am calling contains? This is because the lock granularity is too coarse. We need to evaluate which methods really conflict and let different methods use different locks, so as to improve concurrency while staying thread-safe.
But different locks for different methods are still not enough, because some operations inside a method are already thread-safe, and only the code that touches contended resources needs to be locked. This matters most when the code that does not need the lock is time-consuming: it holds the lock for a long time while other threads can only wait in line. Imagine a method that first sleeps for a while and then updates a shared counter.
Obviously, locking only the counter update rather than the whole method is the normal way to use the lock. In everyday business code, however, it is rarely as obvious as the sleep in my example. Sometimes you have to change the order in which the code executes to make the lock granularity fine enough.
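The two ways of placing the lock can be sketched like this. The names (`GranularityDemo`, `slowLocalWork`, `coarse`, `fine`) are assumptions, and `Thread.sleep` merely stands in for any slow work that touches no shared state.

```java
// Hypothetical demo: the same update with a coarse and a fine lock scope.
class GranularityDemo {
    private final Object lock = new Object();
    private int counter;

    // Stand-in for slow work that does not touch shared state.
    private void slowLocalWork() {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Coarse: the slow part runs while holding the lock, so callers queue up.
    void coarse() {
        synchronized (lock) {
            slowLocalWork();
            counter++;
        }
    }

    // Fine: only the shared update is inside the lock.
    void fine() {
        slowLocalWork();
        synchronized (lock) {
            counter++;
        }
    }

    int counter() {
        synchronized (lock) {
            return counter;
        }
    }

    // Runs fine() from several threads and returns the final counter.
    static int runFine(int threads, int callsPerThread) {
        GranularityDemo demo = new GranularityDemo();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    demo.fine();
                }
            });
            pool[t].start();
        }
        for (Thread th : pool) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return demo.counter();
    }
}
```

Both variants produce the same count. The difference is that with `coarse()` the sleeps of all threads are serialized, while with `fine()` they overlap.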
Sometimes, on the other hand, we need the lock to be coarse enough, but the JVM detects this case and optimizes it for us.
Suppose the logic called in one method goes through lock, execute A, unlock, lock, execute B, unlock. Obviously we only need lock, execute A, execute B, unlock.
So during just-in-time compilation the JVM coarsens the lock, expanding its scope so that both steps run under a single acquisition.
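As a rough illustration of lock coarsening, the two methods below show the shape of the code before and after the optimization. This is hand-written Java with hypothetical names (`CoarseningSketch`, `asWritten`, `asIfCoarsened`); the JIT performs the transformation on compiled code, not on source, and whether it does so in a given case is up to the JVM.

```java
// Hypothetical sketch: the same logic with two lock scopes and with one.
class CoarseningSketch {
    private final Object lock = new Object();
    private final StringBuilder steps = new StringBuilder();

    // As written by the programmer: lock, A, unlock, lock, B, unlock.
    void asWritten() {
        synchronized (lock) {
            steps.append("A");
        }
        synchronized (lock) {
            steps.append("B");
        }
    }

    // What a coarsened version is equivalent to: lock, A, B, unlock.
    void asIfCoarsened() {
        synchronized (lock) {
            steps.append("A");
            steps.append("B");
        }
    }

    String steps() {
        synchronized (lock) {
            return steps.toString();
        }
    }
}
```

Merging the two regions saves one unlock and one lock, at the cost of holding the lock slightly longer.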
The JVM also performs lock elimination: if escape analysis determines that an object is private to one thread, the object must be thread-safe, so the locking inside it is skipped and its methods are called directly.
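A commonly cited candidate for lock elimination is a `StringBuffer` that never leaves the method, sketched below under a hypothetical class name. Whether the JIT actually removes the locks depends on the JVM and its settings; the code behaves the same either way.

```java
// Hypothetical sketch: a synchronized object that never escapes its method.
class LockEliminationSketch {
    static String join(String a, String b) {
        StringBuffer sb = new StringBuffer();  // local, never shared with another thread
        sb.append(a);                          // append is synchronized, but no other
        sb.append(b);                          // thread can ever contend for this lock
        return sb.toString();
    }
}
```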
The read-write lock is an instance of what we mentioned above: reducing lock granularity according to the scenario. One lock is split into a read lock and a write lock, which is particularly suitable when reads far outnumber writes, for example in a home-grown cache.
A read-write lock allows multiple threads to read the shared variable at the same time, but write operations are exclusive, and writing and reading exclude each other. To put it bluntly, while one thread is writing, only that thread can write, and other threads can neither read nor write.
Let's look at a small example, which also contains a small detail. The code simulates reading from a cache. First, the read lock is taken to get the data from the cache. If the cache has no data, the read lock is released; then the write lock is taken, the data is fetched from the database, put into the cache, and returned. The small detail is that after acquiring the write lock we check data = getFromCache() once more, because several threads may call getData() at the same time while the cache is empty, so they all compete for the write lock. Only one thread gets the write lock first and puts the data into the cache; the threads that acquire it afterwards find the data already there and skip the database query.
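The flow described above can be sketched as follows. The method name `getData` comes from the text; the class name `RwCache`, the `loadFromDb` stand-in for the database and the `dbLoads` counter are assumptions added for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical cache guarded by a read-write lock.
class RwCache {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final Map<String, String> cache = new HashMap<>();
    final AtomicInteger dbLoads = new AtomicInteger();   // counts simulated DB hits

    // Stand-in for a real database query.
    private String loadFromDb(String key) {
        dbLoads.incrementAndGet();
        return "value-of-" + key;
    }

    String getData(String key) {
        rw.readLock().lock();
        try {
            String data = cache.get(key);
            if (data != null) {
                return data;                   // cache hit under the read lock
            }
        } finally {
            rw.readLock().unlock();            // must release before taking the write lock
        }

        rw.writeLock().lock();
        try {
            String data = cache.get(key);      // check again: another thread may have filled it
            if (data == null) {
                data = loadFromDb(key);
                cache.put(key, data);
            }
            return data;
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```

The second lookup under the write lock is what keeps several threads that missed the cache at the same time from each querying the database.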
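Of course, everyone knows the usage paradigm of Lock: you need try-finally to guarantee that the lock is released. There is another important point about read-write locks: a lock cannot be upgraded. What does that mean? If the code above tried to acquire the write lock while still holding the read lock, the thread would block forever, because the write lock cannot be granted while any read lock is held, including its own.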
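The failed upgrade can be observed without hanging the program by using `tryLock()` instead of `lock()`. The class and method names below are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo: a read lock cannot be upgraded to a write lock.
class UpgradeDemo {
    // Returns whether the write lock could be taken while holding the read lock.
    static boolean tryUpgrade() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.readLock().lock();
        try {
            // writeLock().lock() would block forever here; tryLock() just fails.
            boolean upgraded = rw.writeLock().tryLock();
            if (upgraded) {
                rw.writeLock().unlock();
            }
            return upgraded;
        } finally {
            rw.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("upgrade succeeded: " + tryUpgrade());
    }
}
```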
The read lock, however, can be acquired while holding the write lock, which makes lock downgrading possible. Some people may ask: if you already hold the write lock, why would you want a read lock?
It is still useful. For example, a thread takes the write lock, acquires the read lock when the write is almost finished, and then releases the write lock. At this point it still holds the read lock, so it is guaranteed to be able to use the data it has just written straight away, and other threads can read the data too because the write lock is gone.
Once the write is done there is no need for a lock as domineering as the write lock, so we downgrade it and let everyone read.
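A minimal sketch of lock downgrading with `ReentrantReadWriteLock`, using hypothetical names (`DowngradeDemo`, `updateThenRead`). The order matters: the read lock is acquired before the write lock is released.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo: write, downgrade to a read lock, then read.
class DowngradeDemo {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private int value;

    int updateThenRead(int newValue) {
        rw.writeLock().lock();
        try {
            value = newValue;
            rw.readLock().lock();      // downgrade: take the read lock first
        } finally {
            rw.writeLock().unlock();   // then give up the write lock
        }
        try {
            // Other readers may run now, but no writer can change value
            // until this read lock is released.
            return value;
        } finally {
            rw.readLock().unlock();
        }
    }
}
```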
To summarize, the read-write lock is suitable when reads far outnumber writes. It cannot be upgraded, but it can be downgraded. A Lock must be used together with try-finally to guarantee that it is released.
By the way, a little more about the implementation of the read-write lock. Those familiar with AQS may know the state field inside it. The read-write lock splits the int state into two halves: the upper 16 bits record the state of the read lock, and the lower 16 bits record the state of the write lock. Compared with an ordinary mutex, it has to maintain both of these states and treat the two kinds of lock differently in the wait queue.
So in scenarios that do not suit a read-write lock, it is better to use a mutex directly, because the read-write lock also has to perform bit-shift checks on the state and other extra operations.
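The idea of packing two counters into one `int` can be sketched as below. This is an illustration of the bit layout described above, not the JDK source; the class, constant and method names are made up.

```java
// Hypothetical sketch of a 32-bit state split into two 16-bit counters.
class RwStateSketch {
    static final int SHIFT = 16;
    static final int WRITE_MASK = (1 << SHIFT) - 1;   // lower 16 bits: 0x0000FFFF

    // Upper 16 bits hold the read count, lower 16 bits hold the write count.
    static int pack(int readHolds, int writeHolds) {
        return (readHolds << SHIFT) | writeHolds;
    }

    static int readCount(int state) {
        return state >>> SHIFT;          // shift the upper half down
    }

    static int writeCount(int state) {
        return state & WRITE_MASK;       // mask off the upper half
    }
}
```

Every acquire and release has to go through this kind of shifting and masking, which is the extra work a plain mutex does not need.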
StampedLock also deserves a brief mention. It was introduced in Java 1.8 and does not seem to appear as often as ReentrantReadWriteLock. It supports write locks, pessimistic read locks, and optimistic reads. Its write lock and pessimistic read lock behave much like the locks in ReentrantReadWriteLock; what it adds is the optimistic read.
From the analysis above, we know that a read-write lock does not allow writing while a read is in progress, whereas the optimistic read of StampedLock allows one thread to write in the meantime. Optimistic reading is the same idea as the database optimistic locking we already know. Database optimistic locking relies on a version field: the update statement includes the version it read in its where clause and succeeds only if that version has not changed.
StampedLock's optimistic read works in a similar way: it hands out a stamp before the read, and after reading you validate that no write has happened since.
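A minimal usage sketch of the optimistic read, modelled on the well-known point example. The class name `StampedPoint` and its methods are assumptions; `tryOptimisticRead`, `validate`, `readLock`, `writeLock`, `unlockRead` and `unlockWrite` are the real `StampedLock` methods.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical point class protected by a StampedLock.
class StampedPoint {
    private final StampedLock sl = new StampedLock();
    private double x;
    private double y;

    void move(double dx, double dy) {
        long stamp = sl.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            sl.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = sl.tryOptimisticRead();   // no lock is taken, only a stamp
        double cx = x;
        double cy = y;
        if (!sl.validate(stamp)) {             // a write happened since the stamp was issued
            stamp = sl.readLock();             // fall back to a pessimistic read lock
            try {
                cx = x;
                cy = y;
            } finally {
                sl.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);
    }
}
```

The values read optimistically must be copied into local variables and used only after `validate` succeeds, because they may be inconsistent if a writer was active.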
This is the one area where it is better than ReentrantReadWriteLock; in other respects it falls short. For example, StampedLock is not reentrant and does not support condition variables. Another point is that when using StampedLock you must not interrupt a thread that is blocked acquiring the lock, because that can drive the CPU to 100%. I ran the example provided on the concurrent programming website and reproduced it.
The specific reasons will not be described here. A link is posted at the end of the article, and it explains them in detail.
So even when something looks powerful, you need to really understand it and be familiar with it before you can use it appropriately.
Copy-on-write is also used in many places, such as the fork() operation on processes. It is also very helpful at the level of our business code, because reads do not block writes and writes do not block reads. It suits scenarios with many reads and few writes.
An example is CopyOnWriteArrayList in Java. Some people hear that it is thread-safe and that reads do not block writes, and decide on the spot to use it everywhere!
You must first understand that copy-on-write makes a copy of the data. Every modification you make triggers an Arrays.copyOf inside CopyOnWriteArrayList, and the change is then applied to the copy. If there are many modifications and the copied data is large, this is a disaster!
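The copying behaviour is visible through iterators: an iterator keeps reading the array that existed when it was created. The sketch below uses hypothetical names (`CowDemo`, `snapshotVsLive`).

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo: an iterator sees a snapshot, the list sees the new copy.
class CowDemo {
    // Returns {elements seen by the old iterator, current list size}.
    static int[] snapshotVsLive() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        Iterator<String> it = list.iterator();   // bound to the current backing array
        list.add("c");                           // copies the array, then swaps it in
        int seen = 0;
        while (it.hasNext()) {                   // no ConcurrentModificationException
            it.next();
            seen++;
        }
        return new int[] {seen, list.size()};
    }

    public static void main(String[] args) {
        int[] r = snapshotVsLive();
        System.out.println("iterator saw " + r[0] + ", list now has " + r[1]);
    }
}
```

Each `add` allocates and fills a whole new array, which is why frequent writes on a large list are so expensive.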
Finally, let's talk about using concurrency-safe containers, taking the familiar ConcurrentHashMap as the example. New colleagues seem to think that as long as they use a concurrency-safe container, their code must be thread-safe. Not necessarily: it depends on how the container is used.
Consider code that uses ConcurrentHashMap to record everyone's salary, with a limit of 100 entries: several threads each check that the map's size is below 100 and then put an entry.
The final result exceeds the limit, that is, more than 100 people end up recorded in the map. So how can the result be made correct? It is as simple as adding a lock.
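The check-then-put problem and its locked fix can be sketched as follows. The original listing is not available, so the names (`SalaryRegistry`, `unsafeAdd`, `safeAdd`, `runSafe`) are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical registry that should never hold more than LIMIT entries.
class SalaryRegistry {
    static final int LIMIT = 100;
    private final Map<String, Integer> salaries = new ConcurrentHashMap<>();

    // Check-then-act without a lock: two threads can both see size 99
    // and both put, so the map may end up above the limit.
    void unsafeAdd(String name, int salary) {
        if (salaries.size() < LIMIT) {
            salaries.put(name, salary);
        }
    }

    // The check and the put run under one lock, so they form one atomic step.
    synchronized void safeAdd(String name, int salary) {
        if (salaries.size() < LIMIT) {
            salaries.put(name, salary);
        }
    }

    int size() {
        return salaries.size();
    }

    // Tries to add 1000 distinct employees in parallel through safeAdd.
    static int runSafe() {
        SalaryRegistry registry = new SalaryRegistry();
        IntStream.rangeClosed(1, 1000).parallel()
                 .forEach(i -> registry.safeAdd("employee-" + i, i));
        return registry.size();
    }
}
```

Each call to `size()` and `put()` is thread-safe on its own; the bug lies in the gap between them.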
After seeing this, some people will say: why use ConcurrentHashMap if you lock anyway? I could just add a lock to a HashMap and be done! Yes, you are right! Our usage here is a compound operation: we first check the size of the map and then call put. ConcurrentHashMap cannot guarantee that compound operations are thread-safe!
ConcurrentHashMap only guarantees thread safety for each individual method it exposes, not for compound operations built from several calls.
Of course, my example is not entirely fitting. The reason ConcurrentHashMap outperforms a locked HashMap is its segmented locking, and that only shows when many different keys are being operated on. The key point I want to make, though, is that you should not be careless: you cannot simply assume that using it makes your code thread-safe.
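As a hedged illustration of relying on a single atomic method instead of a compound read-then-write, the sketch below counts hits per key with `merge`. The class name `HitCounter` and the key scheme are made up.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical per-key counter built on one atomic map operation.
class HitCounter {
    private final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

    // merge performs the read-modify-write for one key atomically,
    // so no external lock is needed.
    void record(String key) {
        hits.merge(key, 1, Integer::sum);
    }

    int count(String key) {
        return hits.getOrDefault(key, 0);
    }

    // Records 10,000 hits spread evenly over 10 keys, in parallel.
    static int run() {
        HitCounter counter = new HitCounter();
        IntStream.rangeClosed(1, 10_000).parallel()
                 .forEach(i -> counter.record("key-" + (i % 10)));
        return counter.count("key-0");
    }
}
```

Updates to different keys can proceed in parallel here, which is the situation where ConcurrentHashMap has an advantage over a single lock around a HashMap.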
Today we talked about the sources of concurrency bugs, namely three problems: visibility, atomicity, and ordering. Then I briefly covered the key points of the synchronized keyword: locking on a static field or a static method gives a class-level lock, while locking on a non-static field or a non-static method gives an instance-level lock.
Then we talked about lock granularity. Different scenarios call for different locks rather than one lock for everything, and the granularity of the locks inside a method should be fine. For example, in scenarios with many reads and few writes, you can use read-write locks, copy-on-write, and so on.
Finally, concurrency-safe containers must be used correctly. We cannot blindly assume that using them guarantees thread safety, and we must watch out for compound operations.
Of course, I only touched on these topics briefly today. There is much more to concurrent programming, and writing thread-safe code is not easy. The Kafka event processing flow I analyzed earlier is a good example. The original version relied on all kinds of locks to control concurrency safety, and later its bugs became impossible to fix. Multi-threaded programming is hard, debugging it is hard, and fixing its bugs is hard too.
Therefore, the Kafka event processing module was eventually changed to a single-threaded event queue model: accesses that compete for shared data are abstracted into events, the events are put into a blocking queue, and a single thread processes them.
So before using a lock, we should think it over. Is it necessary? Can the design be simplified? Otherwise you will find out how painful it is to maintain later.
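The single-threaded event queue idea can be sketched in a few lines. This is a generic illustration and not Kafka's code; every name in it (`EventLoopSketch`, `deposit`, `balance`, `STOP`) is hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical event loop: shared state is touched by one worker thread only.
class EventLoopSketch {
    private static final Runnable STOP = () -> { };    // sentinel event that ends the loop
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::loop);
    private int balance;                                // modified only inside the worker

    private void loop() {
        try {
            while (true) {
                Runnable event = queue.take();          // blocks until an event arrives
                if (event == STOP) {
                    return;
                }
                event.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void start() {
        worker.start();
    }

    // Any thread may call this; the update itself runs later on the worker.
    void deposit(int amount) {
        queue.add(() -> balance += amount);
    }

    int stopAndGetBalance() {
        queue.add(STOP);
        try {
            worker.join();                              // also makes balance visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return balance;
    }

    // Several producers post deposits of 1; returns the final balance.
    static int run(int producers, int perProducer) {
        EventLoopSketch loop = new EventLoopSketch();
        loop.start();
        Thread[] pool = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            pool[p] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    loop.deposit(1);
                }
            });
            pool[p].start();
        }
        for (Thread t : pool) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return loop.stopAndGetBalance();
    }
}
```

The only synchronization left is inside the blocking queue. The code that changes `balance` needs no lock because a single thread runs it.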