In the Java virtual machine, memory for objects and arrays is allocated on the heap, and the heap is the main region that the garbage collector reclaims. If dynamically created objects or arrays are not reclaimed in time while a Java program runs and keep accumulating, the heap eventually fills up and the program fails with an OutOfMemoryError (OOM).
The JVM provides a garbage collection mechanism, referred to as GC. GC continuously reclaims garbage objects in the heap while the program runs, which keeps the program running normally.
So-called "garbage" objects are objects that the running program can no longer use, that is, objects that are no longer alive. So how do we decide whether an object in the heap is garbage?
One approach is reference counting. Each object has a reference count attribute that stores the number of references currently pointing to it. When the count is 0, nothing refers to the object, so it can no longer be used and can be treated as garbage. However, this method has a serious flaw: it cannot handle mutual or circular references between objects. When two objects refer to each other but are not referenced by any other object, their reference counts are both non-zero, so they are never reclaimed, even though neither object is of any use.
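The circular reference problem can be illustrated with a toy model. The sketch below is not how any JVM works; the `Node` class, its `refCount` field, and the `setField` helper are hypothetical names made up for this illustration, and the counts are adjusted by hand to mimic what a reference counting collector would do.

```java
// Toy model of reference counting. The JVM does not manage memory this way;
// Node, refCount and setField are hypothetical names used only for illustration.
class RefCountDemo {
    static class Node {
        final String name;
        int refCount = 0;   // number of references currently pointing at this node
        Node field;         // a single outgoing reference

        Node(String name) { this.name = name; }

        // Point this node's field at target and keep both counts up to date.
        void setField(Node target) {
            if (field != null) field.refCount--;
            field = target;
            if (target != null) target.refCount++;
        }
    }

    // Builds a two-object cycle, drops the external references,
    // and returns the remaining counts as {count of a, count of b}.
    static int[] countsAfterDroppingExternalRefs() {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++;        // the local variable "a" refers to the first object
        b.refCount++;        // the local variable "b" refers to the second object

        a.setField(b);       // a -> b
        b.setField(a);       // b -> a

        a.refCount--;        // simulate the program executing "a = null"
        b.refCount--;        // simulate the program executing "b = null"
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingExternalRefs();
        // Both counts stay at 1, so a pure reference counter never frees the pair.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);   // prints a=1, b=1
    }
}
```

Neither count ever reaches 0, even though the program has no way left to reach either object.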
To avoid the problems of reference counting, Java uses reachability analysis to identify garbage objects.
This method treats the reference relationships between all objects as a tree (more precisely, a graph). Starting from root nodes called GC Roots, it traverses every referenced object. Objects reached by the traversal are reachable objects; all other objects are unreachable and can be reclaimed.
So which objects can serve as GC Roots?
Objects referenced in the virtual machine stack (the local variable table in a stack frame)
Objects referenced by static fields in the method area
Objects referenced by constants in the method area
Objects referenced by JNI in the native method stack
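The traversal from GC Roots can be sketched as an ordinary graph search. This is a minimal illustration, not the collector's real implementation: the `Obj` class and the `reachableNames` method are hypothetical, and the root set is simply passed in as a list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy reachability analysis. Obj and reachableNames are hypothetical names.
class ReachabilityDemo {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        Obj(String name) { this.name = name; }
    }

    // Returns the names of all objects reachable from the given roots.
    static Set<String> reachableNames(List<Obj> roots) {
        Set<Obj> visited = new HashSet<>();          // Obj uses identity equality
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (visited.add(current)) {              // true only on the first visit
                pending.addAll(current.refs);
            }
        }
        Set<String> names = new TreeSet<>();         // sorted for stable output
        for (Obj o : visited) names.add(o.name);
        return names;
    }

    // root -> a -> b is reachable; c <-> d is an isolated cycle.
    static Set<String> demo() {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        Obj c = new Obj("c");
        Obj d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        return reachableNames(Arrays.asList(root));
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints [a, b, root]
    }
}
```

The cycle between `c` and `d` is never visited, so it counts as garbage even though each of the two objects is still referenced by the other.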
Garbage collection, whether by reference counting or by reachability analysis, depends on how objects are referenced. Java has four kinds of references:
Strong reference - Most of the references we use are strong references. If an object has a strong reference, it is reachable and the garbage collector will never reclaim it. Even when memory is very tight, the Java virtual machine would rather throw an OutOfMemoryError and let the program terminate abnormally than reclaim a strongly referenced object. For this reason, strong references that are held longer than necessary are one of the main causes of Java memory leaks.
Soft reference - If an object has only soft references, the garbage collector will not reclaim it while memory is sufficient, and will reclaim it when memory runs short. Until the garbage collector reclaims it, the program can still use the object.
Weak reference - An object that has only weak references is treated as dispensable. Weak references are similar to soft references but weaker: an object with only weak references has a shorter life cycle. When the garbage collector scans the memory under its control and finds an object that has only weak references, it reclaims that object regardless of whether memory is currently sufficient.
Phantom reference (sometimes translated as "virtual reference") - If an object has only phantom references, it may be reclaimed by the garbage collector at any time, as if it had no references at all. Phantom references are mainly used to track when objects are garbage collected, and we generally do not use them directly.
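The four reference kinds map onto classes in the `java.lang.ref` package. The sketch below only shows behaviour that is deterministic: while a strong reference exists, soft and weak references still return the object, and `PhantomReference.get()` always returns null. The class and method names (`ReferenceKindsDemo`, `observeWhileStronglyReachable`) are made up for this example. The last lines of `main` request a GC, but `System.gc()` is only a hint, so no particular output is promised there.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // Returns {soft still sees the object, weak still sees the object, phantom.get() is null}.
    static boolean[] observeWhileStronglyReachable() {
        Object strong = new Object();                       // ordinary strong reference
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        // A phantom reference must be registered with a queue.
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        boolean softSees = soft.get() == strong;
        boolean weakSees = weak.get() == strong;
        boolean phantomHidden = phantom.get() == null;      // always null by design
        return new boolean[] { softSees, weakSees, phantomHidden };
    }

    public static void main(String[] args) {
        boolean[] r = observeWhileStronglyReachable();
        System.out.println("soft=" + r[0] + " weak=" + r[1] + " phantomHidden=" + r[2]);

        // With no strong reference left, the referent MAY be cleared after a GC.
        // System.gc() is only a request, so the result is not guaranteed.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        System.out.println("weak referent cleared: " + (weak.get() == null));
    }
}
```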
Reachability analysis determines which objects need to be reclaimed. So how is the reclamation actually performed?
The mark-sweep algorithm first marks the memory of the objects that can be reclaimed and then clears the marked memory.
Figure: Mark-sweep algorithm (before collection)
Figure: Mark-sweep algorithm (after collection)
However, as the program runs, memory is continuously allocated and released, and this approach leaves many discontinuous free regions in the heap, that is, memory fragmentation. As a result, even when the total free memory is sufficient, there may be no contiguous block large enough for a large allocation, which can trigger frequent GCs, hurt efficiency, or even lead to OOM.
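The fragmentation effect can be reproduced on a toy heap. In this sketch the heap is an array of one-object slots and the set of objects marked for reclamation is given as input. All names (`MarkSweepDemo`, `sweep`, `largestFreeRun`) are hypothetical, and a real collector works on raw memory rather than a `String[]`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy mark-sweep: each array slot holds one object name, null means free.
class MarkSweepDemo {
    // Sweep phase: clear every slot whose object was marked for reclamation.
    static String[] sweep(String[] heap, Set<String> markedForReclaim) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != null && markedForReclaim.contains(after[i])) {
                after[i] = null;
            }
        }
        return after;
    }

    static int freeSlots(String[] heap) {
        int free = 0;
        for (String slot : heap) {
            if (slot == null) free++;
        }
        return free;
    }

    // Size of the largest contiguous block of free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    static String[] demo() {
        String[] heap = { "A", "B", "C", "D", "E", "F", "G", "H" };
        Set<String> garbage = new HashSet<>(Arrays.asList("B", "D", "F", "H"));
        return sweep(heap, garbage);
    }

    public static void main(String[] args) {
        String[] after = demo();
        System.out.println(Arrays.toString(after));
        System.out.println("free slots: " + freeSlots(after));                   // 4
        System.out.println("largest free run: " + largestFreeRun(after));        // 1
    }
}
```

Four slots are free after the sweep, but no two of them are adjacent, so an object needing two contiguous slots cannot be allocated.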
Unlike the mark-sweep algorithm, the mark-compact algorithm does not clear the reclaimable memory immediately after marking. Instead, it first moves all surviving objects to one end of the memory and then clears everything beyond the boundary.
Figure: Mark-compact algorithm (before collection)
Figure: Mark-compact algorithm (after collection)
The advantage is that it does not cause memory fragmentation.
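A toy version of the compaction step makes the difference visible. As before, the heap is modelled as an array of one-object slots and the set of objects marked for reclamation is given. `MarkCompactDemo`, `compact`, and `largestFreeRun` are hypothetical names, and real collectors also have to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy mark-compact: each array slot holds one object name, null means free.
class MarkCompactDemo {
    // Slide every surviving object toward index 0, then clear the tail.
    static String[] compact(String[] heap, Set<String> markedForReclaim) {
        String[] after = heap.clone();
        int next = 0;                               // next slot for a survivor
        for (int i = 0; i < after.length; i++) {
            String slot = after[i];
            if (slot != null && !markedForReclaim.contains(slot)) {
                after[next++] = slot;               // next <= i, so nothing live is overwritten
            }
        }
        for (int i = next; i < after.length; i++) {
            after[i] = null;                        // everything past the boundary is free
        }
        return after;
    }

    // Size of the largest contiguous block of free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    static String[] demo() {
        String[] heap = { "A", "B", "C", "D", "E", "F", "G", "H" };
        Set<String> garbage = new HashSet<>(Arrays.asList("B", "D", "F", "H"));
        return compact(heap, garbage);
    }

    public static void main(String[] args) {
        String[] after = demo();
        System.out.println(Arrays.toString(after));
        System.out.println("largest free run: " + largestFreeRun(after));   // 4
    }
}
```

All four free slots now form one contiguous block at the end of the heap.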
The copying algorithm first divides memory into two blocks and allocates objects in only one of them. When that block is used up, garbage collection runs: all surviving objects are copied to the other block, and the first block is cleared.
Figure: Copying algorithm (before collection)
Figure: Copying algorithm (after collection)
This algorithm does not produce memory fragmentation, but only half of the memory space is usable at any time. In addition, the cost of the copying algorithm depends on the number of surviving objects: if many objects survive, its efficiency drops sharply.
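The copying step can be sketched with two arrays standing in for the two memory blocks. `CopyingDemo` and `collect` are hypothetical names, and the set of live objects is assumed to come from a prior reachability analysis. The return value is the number of objects copied, which is the work the collector has to do.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy copying collector: "from" is the block in use, "to" is the empty block.
class CopyingDemo {
    // Copies live objects into the start of "to", clears "from",
    // and returns how many objects had to be copied.
    static int collect(String[] from, String[] to, Set<String> live) {
        int next = 0;
        for (int i = 0; i < from.length; i++) {
            if (from[i] != null && live.contains(from[i])) {
                to[next++] = from[i];      // survivors end up packed together
            }
            from[i] = null;                // the whole block is released
        }
        return next;
    }

    public static void main(String[] args) {
        String[] from = { "A", "B", "C", "D" };
        String[] to = new String[from.length];
        Set<String> live = new HashSet<>(Arrays.asList("B", "D"));

        int copied = collect(from, to, live);
        System.out.println("copied: " + copied);                 // 2
        System.out.println("to:   " + Arrays.toString(to));      // [B, D, null, null]
        System.out.println("from: " + Arrays.toString(from));    // all null
    }
}
```

After a collection the two blocks swap roles: new objects are allocated in `to`, and `from` becomes the empty target for the next collection.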
In the Java virtual machine, object lifetimes vary. Most objects live only briefly, and only a small number persist in memory for a long time, so objects can be placed in different areas according to their lifetime. A Java virtual machine heap that uses the generational collection algorithm is generally divided into three areas, each storing one of these kinds of objects:
New generation (also called the young generation) - Newly created objects. While code runs, new objects are created continuously. Many of them are referenced only by local variables and quickly become garbage. These objects are placed in a memory area called the new generation, which is characterized by many garbage objects and few surviving objects.
Old generation - Some objects were created very early and have not been recycled after multiple GCs, but have always survived. These objects are placed in an area called the old generation. The characteristic of the old generation is that there are many surviving objects and few garbage objects.
Permanent generation - Data that lives as long as the virtual machine itself, such as class metadata, static data, and constants. This data is placed in an area called the permanent generation. It generally needs little garbage collection and survives for as long as the virtual machine runs. (In HotSpot up to Java 1.6, the permanent generation was the implementation of the method area. In Java 1.7, parts of it, such as the string constant pool and static variables, were moved to the heap. In Java 1.8, the permanent generation was removed and replaced by the metaspace, which is allocated in native memory.)
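One way to see how a particular JVM actually lays out these areas is to list its memory pools through the standard `java.lang.management` API. This is only a sketch; the class name `MemoryPoolsDemo` is made up, and the pool names printed depend on the JVM version and the garbage collector in use, so no specific output is shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    // Returns one line per memory pool: its type (heap or non-heap) and its name.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Pool names vary by JVM and collector; on many HotSpot configurations
        // they include Eden, Survivor, an old generation pool, and Metaspace.
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```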
The generational collection algorithm chooses a garbage collection strategy separately for the new generation and the old generation.
In the new generation, each GC reclaims many garbage objects and only a few objects survive. The copying algorithm is therefore used: only the few surviving objects need to be copied during GC.
The new generation is not split 1:1 for copying. Instead, it is divided into three areas, Eden, SurvivorA, and SurvivorB, in a ratio of 8:1:1. Eden refers to the Garden of Eden, where new objects are created; the Survivor areas hold the survivors, that is, objects still alive after a GC.
The Eden area serves allocation requests. When Eden is almost full, a Minor GC (new generation GC) runs: surviving objects are placed in SurvivorA and Eden is cleared;
After Eden is cleared, it continues to serve allocation requests;
When Eden fills up again, a Minor GC runs on Eden and SurvivorA together: surviving objects are placed in SurvivorB, and both Eden and SurvivorA are cleared;
Eden continues to serve allocation requests and the process repeats: each time Eden fills up, the surviving objects in Eden and in the Survivor area currently in use are placed in the other Survivor area;
When a Survivor area fills up while there are still objects to copy, or when an object has survived 15 GCs, those objects are placed in the old generation; when the old generation also fills up, a Major GC (old generation GC) runs to collect garbage in the old generation.
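The Minor GC steps above can be sketched as a small simulation. This is a simplified model under stated assumptions, not HotSpot's implementation: `GenerationalDemo`, `Obj`, `minorGc`, and `runCycles` are hypothetical names, capacities are counted in objects rather than bytes, liveness is supplied by the caller, and an object is promoted on the collection at which its age reaches 15.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of Eden, two Survivor areas and the old generation.
class GenerationalDemo {
    static class Obj {
        final String name;
        int age = 0;                       // number of Minor GCs survived
        Obj(String name) { this.name = name; }
    }

    static final int MAX_AGE = 15;         // promotion age taken from the text
    final int survivorCapacity;            // capacity counted in objects (simplification)
    final List<Obj> eden = new ArrayList<>();
    List<Obj> survivorFrom = new ArrayList<>();   // Survivor area currently in use
    List<Obj> survivorTo = new ArrayList<>();     // empty Survivor area
    final List<Obj> old = new ArrayList<>();

    GenerationalDemo(int survivorCapacity) { this.survivorCapacity = survivorCapacity; }

    void allocate(String name) { eden.add(new Obj(name)); }

    // Copy survivors of Eden and the in-use Survivor area, then swap the Survivor roles.
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(survivorFrom);
        for (Obj o : candidates) {
            if (!isLive.test(o)) continue;            // garbage is simply not copied
            o.age++;
            if (o.age >= MAX_AGE || survivorTo.size() >= survivorCapacity) {
                old.add(o);                           // promoted: too old or no room left
            } else {
                survivorTo.add(o);
            }
        }
        eden.clear();
        survivorFrom.clear();
        List<Obj> tmp = survivorFrom;                 // swap the two Survivor areas
        survivorFrom = survivorTo;
        survivorTo = tmp;
    }

    // One long-lived object plus one short-lived object per cycle.
    static GenerationalDemo runCycles(int cycles, int survivorCapacity) {
        GenerationalDemo heap = new GenerationalDemo(survivorCapacity);
        heap.allocate("keeper");
        for (int i = 0; i < cycles; i++) {
            heap.allocate("temp" + i);
            heap.minorGc(o -> o.name.equals("keeper"));
        }
        return heap;
    }

    public static void main(String[] args) {
        GenerationalDemo after14 = runCycles(14, 2);
        System.out.println("after 14 GCs: survivor=" + after14.survivorFrom.size()
                + " old=" + after14.old.size());      // survivor=1 old=0
        GenerationalDemo after15 = runCycles(15, 2);
        System.out.println("after 15 GCs: survivor=" + after15.survivorFrom.size()
                + " old=" + after15.old.size());      // survivor=0 old=1
    }
}
```

The short-lived objects never leave Eden, while the long-lived one moves back and forth between the two Survivor areas until it is old enough to be promoted.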
Objects in the old generation generally live for a long time, and most of them survive each GC, so the mark-compact algorithm is used: surviving objects are moved toward one end during GC, which avoids memory fragmentation.
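The virtual machine prints information each time a GC is triggered, and you can analyze the reason for the GC from the log. The reason codes below are the ones used in the logs of Android's Dalvik virtual machine; other virtual machines, such as HotSpot, use different log formats.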
GC_FOR_MALLOC: Indicates that the GC was triggered because there was not enough memory when allocating an object on the heap.
GC_CONCURRENT: Indicates that the GC was triggered automatically because the application's heap usage reached a certain threshold, that is, the heap was almost full.
GC_EXPLICIT: Indicates that the GC was triggered because the application called the System.gc or VMRuntime.gc interface, or received the SIGUSR1 signal.
GC_BEFORE_OOM: Indicates a last-effort GC triggered just before an OOM exception is thrown.