An in-depth explanation of Java garbage collection
1. The core idea of the garbage collection algorithm
The Java language has established a garbage collection mechanism to track objects in use and discover and recycle objects that are no longer used (referenced). This mechanism can effectively prevent two dangers that may occur in dynamic memory allocation: memory exhaustion caused by excessive memory garbage, and illegal memory references caused by improper memory release.
The core idea of the garbage collection algorithm is to identify the objects in the virtual machine's available memory space, that is, the heap space. If an object is still referenced, it is called a live object. Conversely, if an object is no longer referenced, it is a garbage object, and the space it occupies can be reclaimed for reallocation. The choice of garbage collection algorithm and the reasonable tuning of garbage collection parameters directly affect system performance, so developers need a deeper understanding of both.
2. Conditions for triggering the main GC (Garbage Collector)
The JVM performs minor (secondary) GC very frequently, but because each one takes very little time, it has little impact on the system. What deserves more attention is the triggering condition of the main GC (also called major or full GC), because it has a significant impact on the system. In general, two conditions will trigger the main GC:
(1) When the application is idle, that is, when no application thread is running, GC will be called. Because GC runs in the thread with the lowest priority, the GC thread is not scheduled while the application is busy, except in the following case.
(2) When the Java heap memory is insufficient, GC will be called. If an application thread creates a new object while running and there is not enough free memory, the JVM forcibly invokes the GC thread to reclaim memory for the new allocation. If the allocation still cannot be satisfied after one GC, the JVM makes two more GC attempts. If the requirement still cannot be met, the JVM reports an "out of memory" error (java.lang.OutOfMemoryError) and the Java application stops.
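The heap pressure behind condition (2) can be observed from inside a program. The following is a minimal sketch using the standard java.lang.Runtime methods; the class name HeapStats and its helper method are hypothetical, and the numbers printed depend entirely on the JVM and its settings.

```java
// Hypothetical helper for illustration; not part of any library.
class HeapStats {
    /** Bytes of the current heap that are occupied (live objects plus uncollected garbage). */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap   : " + rt.maxMemory());
        System.out.println("used before: " + usedBytes());

        byte[][] blocks = new byte[16][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024]; // 1 MiB each, kept reachable via the array
        }
        System.out.println("used after : " + usedBytes());
    }
}
```

When the used figure approaches the maximum and a collection cannot free enough space, the allocation fails with OutOfMemoryError.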
Whether to perform a main GC is decided by the JVM according to the system environment, and the system environment is constantly changing. The operation of the main GC is therefore nondeterministic: it is impossible to predict exactly when it will occur. What is certain is that, for a long-running application, the main GC is performed repeatedly.
3. Measures to reduce GC overhead
Given the GC mechanism above, how a program runs directly changes the system environment and thereby affects when GC is triggered. If you do not design and code with the characteristics of GC in mind, a series of negative effects such as memory retention will follow. To avoid these effects, the basic principle is to produce less garbage and to reduce the overhead of the GC process as much as possible. Specific measures include the following:
(1) Do not explicitly call System.gc()
This method suggests that the JVM perform a main GC. Although it is only a suggestion and not a guarantee, in many cases it does trigger the main GC, thereby increasing the frequency of the main GC, that is, increasing the number of intermittent pauses. Note in particular that an explicit call to System.gc() in the code does not necessarily perform a GC. This can be verified through the finalize() method: actively calling System.gc() does not cause finalize() to be invoked every time. The finalize() method is called on an object before that object is reclaimed.
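The "suggestion, not guarantee" behaviour can be probed with a short experiment. Because finalize() has been deprecated since Java 9, this sketch swaps in java.lang.ref.WeakReference to observe whether an object was reclaimed; the class and method names are hypothetical, and the first result may differ between runs and JVMs.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class; uses WeakReference in place of finalize().
class GcHintDemo {
    private static Object pinned; // a static field keeps its referent strongly reachable

    /** Returns true if an unreferenced object was reclaimed after requesting a GC. */
    static boolean clearedAfterGcRequest() {
        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.gc();              // only a request; the JVM may ignore it
        return ref.get() == null; // may be true or false
    }

    /** A strongly reachable object is never reclaimed, whatever System.gc() does. */
    static boolean strongSurvives() {
        pinned = new Object();
        WeakReference<Object> ref = new WeakReference<>(pinned);
        System.gc();
        return ref.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("unreferenced object reclaimed: " + clearedAfterGcRequest());
        System.out.println("referenced object survived   : " + strongSurvives());
    }
}
```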
(2) Minimize the use of temporary objects
Temporary objects become garbage once the method call that created them returns. Using fewer temporary objects produces less garbage, which postpones the moment at which the second trigger condition described above occurs and reduces the chance of a main GC.
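One common way to cut down on temporaries is to reuse a single working object across loop iterations. The sketch below is illustrative only (the class and method names are hypothetical): both methods return the same labels, but the second allocates one StringBuilder instead of one per iteration. Whether the difference is measurable depends on the JVM, since some temporaries are optimized away.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class.
class TempObjects {
    /** Allocates a fresh StringBuilder on every iteration. */
    static List<String> labelsWithTemporaries(int n) {
        List<String> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            StringBuilder sb = new StringBuilder(); // new temporary each pass
            sb.append("item-").append(i);
            out.add(sb.toString());
        }
        return out;
    }

    /** Reuses one StringBuilder; only the result strings are newly allocated. */
    static List<String> labelsReusingBuffer(int n) {
        List<String> out = new ArrayList<>(n);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.setLength(0); // reset the buffer instead of allocating a new one
            sb.append("item-").append(i);
            out.add(sb.toString());
        }
        return out;
    }
}
```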
(3) Explicitly set references to null when the object is no longer in use
Generally speaking, an object that is no longer referenced is treated as garbage, so explicitly setting references to unused objects to null helps the GC identify garbage, thereby improving the efficiency of GC.
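A typical case where nulling matters is a container that manages its own storage. In the sketch below, a hypothetical SimpleStack keeps elements in an array; without the assignment to null in pop(), the array would still reference the popped object and the GC could not reclaim it. The slot() accessor exists only to make that visible here.

```java
import java.util.Arrays;

// Hypothetical array-backed stack used only for illustration.
class SimpleStack {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(Object e) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow the backing array
        }
        elements[size++] = e;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object result = elements[--size];
        elements[size] = null; // drop the stale reference so the object can be collected
        return result;
    }

    /** For illustration only: what the backing array holds at the given index. */
    Object slot(int index) {
        return elements[index];
    }
}
```

Local variables that simply go out of scope do not need this treatment; it pays off when a long-lived structure would otherwise keep the reference.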
(4) Try to use StringBuffer instead of String to accumulate strings (see another blog article on String and StringBuffer in Java for details)
Since a String object is immutable, accumulating String objects does not expand an existing String object; a new String object is created instead, as in Str5=Str1+Str2+Str3+Str4. Multiple garbage objects can be generated while such accumulation executes, because the "+" operation must create new String objects. These transitional objects have no practical significance for the system and only add more garbage. To avoid this, use StringBuffer to accumulate strings. Because a StringBuffer is variable-length, it grows in place and does not produce an intermediate String for each append.
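The effect is easiest to see when strings are accumulated in a loop. The following sketch (hypothetical class and method names) contrasts the two styles; both return the same text, but the first creates a new String on every iteration. In single-threaded code, StringBuilder offers the same API as StringBuffer without synchronization.

```java
// Hypothetical example class.
class ConcatDemo {
    /** Builds a new String on every iteration; each earlier result becomes garbage. */
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String p : parts) {
            result = result + p;
        }
        return result;
    }

    /** Appends into one growable buffer and creates the final String once. */
    static String joinWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }
}
```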
(5) If you can use primitive types such as int and long, do not use Integer or Long objects
Primitive variables occupy much less memory than the corresponding wrapper objects. Unless it is necessary, it is best to use primitives. Wrapper types such as Integer are needed only where an object is required, for example as an element of a collection.
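Autoboxing makes it easy to create wrapper objects by accident. In this sketch (hypothetical names), the only difference between the two methods is the declared type of the accumulator, yet the boxed version may allocate a new Long on almost every iteration.

```java
// Hypothetical example class.
class BoxingDemo {
    /** Accumulates in a wrapper: each += unboxes, adds, and boxes the result again. */
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i; // values outside the small Long cache allocate a new object
        }
        return total;
    }

    /** Accumulates in a primitive: no objects are created in the loop. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }
}
```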
(6) Use static object variables as little as possible
Static variables are global variables, and the objects they reference are not reclaimed by the GC for as long as the class remains loaded. They keep occupying memory for that whole time.
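A static collection is the usual way this turns into a problem. The sketch below uses a hypothetical StaticCache class: everything passed to remember() stays reachable through the static field until it is removed explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cache class used only for illustration.
class StaticCache {
    // Reachable from a static field, so every entry stays alive while the class is loaded.
    private static final List<byte[]> CACHE = new ArrayList<>();

    static void remember(byte[] data) {
        CACHE.add(data);
    }

    static int size() {
        return CACHE.size();
    }

    /** Removing entries makes them collectible, provided nothing else refers to them. */
    static void clear() {
        CACHE.clear();
    }
}
```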
(7) Distribute the time of object creation or deletion
Creating a large number of new objects within a short period of time, especially large objects, results in a sudden need for a large amount of memory. Faced with this situation, the JVM can only perform a main GC to reclaim memory or consolidate memory fragments, thereby increasing the frequency of the main GC. The same principle applies to releasing many objects at once: a large number of garbage objects suddenly appears, free space inevitably decreases, and the chance of forcing a main GC the next time a new object is created increases greatly.
4. Garbage collection algorithm
(1) Reference counting collector
Reference counting is an early garbage collection strategy. In this approach, every object in the heap has a reference count. When an object is created and a reference to it is assigned to a variable, the object's reference count is set to 1, for example A a=new A();. Whenever another variable is assigned a reference to the same object, the count is incremented by 1: after b=a; the reference count of the object referred to by a is 2. When a reference to an object goes out of scope or is set to a new value, the object's reference count is decremented by 1: after b=c; the count of that object drops back to 1. Any object with a reference count of 0 can be garbage collected. When an object is garbage collected, the count of every object it references is decremented by 1, so collecting one object may lead to the collection of others. For example, if object b holds the only reference to object a, then when b is garbage collected the reference count of a becomes 0, which causes a to be collected as well.
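The counting rules can be simulated in a few lines. This is a toy model, not how any JVM is implemented: the RcObject class, its retain() and release() methods, and the freed flag are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a reference-counted object; all names are hypothetical.
class RcObject {
    final String name;
    int count = 0;         // simulated reference count
    boolean freed = false; // set when the count drops to zero
    final List<RcObject> fields = new ArrayList<>();

    RcObject(String name) {
        this.name = name;
    }

    /** A variable or field starts referring to this object. */
    void retain() {
        count++;
    }

    /** A reference goes away; at zero the object is freed and releases what it references. */
    void release() {
        if (--count == 0) {
            freed = true;
            for (RcObject child : fields) {
                child.release(); // cascading collection
            }
            fields.clear();
        }
    }

    /** Stores a reference to target inside this object. */
    void addField(RcObject target) {
        target.retain();
        fields.add(target);
    }
}
```

Calling retain() twice and release() once mirrors A a=new A(); b=a; b=c; and leaves the count at 1. Releasing the last reference to an object that holds the only reference to another frees both.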
Benefits of the method: A reference counting collector runs quickly and is interleaved with the running of the program. This interleaving is beneficial in real-time environments where the program cannot be interrupted for long periods of time.
Disadvantages of the method: Reference counting cannot detect cycles (that is, two or more objects that refer to each other). An example of a cycle is a parent object that holds a reference to a child object, while the child object in turn references the parent. The counts of such objects can never reach 0, even when they are no longer reachable from the root objects of the executing program. Another disadvantage is that every increment or decrement of a reference count brings additional overhead.
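The cycle problem can be reproduced with a small simulation. As before, this is a toy model with hypothetical names (CountedNode, CycleDemo), not JVM code: after both external references are dropped, each node still has a count of 1 because of the other node, so neither would ever be freed.

```java
// Toy model; all names are hypothetical.
class CountedNode {
    int count = 0;     // simulated reference count
    CountedNode other; // a single outgoing reference

    /** Redirects the outgoing reference and adjusts both counts. */
    void pointTo(CountedNode target) {
        if (other != null) {
            other.count--;
        }
        other = target;
        if (target != null) {
            target.count++;
        }
    }
}

class CycleDemo {
    /** Returns the counts of parent and child after all external references are dropped. */
    static int[] countsAfterDroppingRoots() {
        CountedNode parent = new CountedNode();
        CountedNode child = new CountedNode();
        parent.count++;        // the variable "parent" refers to it
        child.count++;         // the variable "child" refers to it
        parent.pointTo(child); // parent -> child
        child.pointTo(parent); // child -> parent, closing the cycle
        parent.count--;        // simulate both variables going out of scope
        child.count--;
        return new int[] { parent.count, child.count };
    }
}
```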
(2) Tracing collector
Garbage detection is usually implemented by establishing a set of root objects and checking reachability starting from them. An object is reachable if there is a path of references from a root object, accessible to the executing program, to that object. Root objects are always accessible to the program. Starting from the roots, any object that can be reached is considered a "live" object. Objects that cannot be reached are considered garbage because they can no longer affect the future execution of the program.
The tracing collector traces the object reference graph starting from the root nodes. Objects encountered during tracing are marked in some way: in general, either a mark is set on the object itself, or a separate bitmap records the marks. When tracing ends, unmarked objects are unreachable and can be collected.
The basic tracing algorithm is called "mark and sweep". The name points out the two phases of garbage collection. During the mark phase, the garbage collector traverses the reference graph and marks each object it encounters. During the sweep phase, unmarked objects are released, and the memory obtained by releasing them is returned to the executing program. In the Java virtual machine, the sweep phase must include the finalization of objects.
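The two phases can be sketched on a tiny simulated heap. This is a simplified model under stated assumptions: the heap is just a list of hypothetical Node objects, the mark is a boolean on the node itself, and finalization is left out. Note that an unreachable cycle is reclaimed, which reference counting cannot do.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Simplified model; all names are hypothetical.
class MarkSweepDemo {
    static class Node {
        final String name;
        boolean marked;
        final List<Node> refs = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Mark phase: flag every node reachable from the roots. */
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) {
                continue; // already visited, which also stops cycles from looping forever
            }
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    /** Sweep phase: remove unmarked nodes from the heap and reset marks on survivors. */
    static List<String> sweep(List<Node> heap) {
        List<String> freed = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false; // ready for the next collection cycle
            } else {
                freed.add(n.name);
                it.remove();
            }
        }
        return freed;
    }
}
```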