A Detailed Explanation of the Java Garbage Collection Mechanism
At first glance, garbage collection should do exactly what its name suggests: find and remove garbage. In fact, it does the opposite. Garbage collection keeps track of all objects that are still in use and treats everything else as garbage. With this in mind, let's take a closer look at how this automated memory reclamation, called "garbage collection", is implemented in the JVM.
Manual memory management
Before introducing the modern form of garbage collection, let's briefly review the days when memory had to be allocated and released explicitly by hand. If you forgot to release a piece of memory, it could not be reused: it stayed claimed but unused. This scenario is called a memory leak.
The following is a simple example of manual memory management written in C:
#include <stdlib.h> /* malloc, free */

int send_request() {
    size_t n = read_size();
    int *elements = malloc(n * sizeof(int));

    if (read_elements(n, elements) < n) {
        // elements not freed!
        return -1;
    }

    // …

    free(elements);
    return 0;
}
As you can see, it is easy to forget to free memory. Memory leaks used to be a very common problem, and the only way to fight them was to keep fixing your code. A more elegant approach was needed: releasing unused memory automatically, which reduces the room for human error. This automated process is called garbage collection (GC for short).
Smart pointers
One of the earliest approaches to automatic garbage collection is reference counting. For each object, you keep track of how many times it is referenced. When that counter reaches 0, the object can be safely reclaimed. C++'s shared pointer is a well-known example:
#include <memory>
#include <vector>
using namespace std;

int send_request() {
    size_t n = read_size();
    shared_ptr<vector<int>> elements = make_shared<vector<int>>();

    if (read_elements(n, elements) < n) {
        return -1;
    }

    return 0;
}
The shared_ptr we use keeps track of the number of references to the object. The count is incremented when you pass the pointer to someone else, and decremented when a copy goes out of scope. Once the count reaches 0, the shared_ptr automatically deletes the underlying vector. Of course, this is just an example. As some readers have pointed out, code like this is unlikely to appear in practice, but it is enough as a demonstration.
Automatic Memory Management
In the C++ code above, we still had to state explicitly that we wanted memory management to be taken care of. What if every object used this mechanism? That would be very convenient, because developers would no longer have to think about cleaning up memory. The runtime would automatically know which memory is no longer in use and release it. In other words, it would collect the garbage automatically. The first garbage collector was introduced in Lisp in 1959, and the technology has continued to evolve to this day.
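For comparison, here is a rough Java counterpart of the earlier send_request example. It is only a sketch: readElements is a hypothetical stand-in for the read_elements helper, and the request size is passed in as a parameter instead of being read from anywhere. The point is that neither return path contains any cleanup code.

```java
import java.util.ArrayList;
import java.util.List;

final class SendRequestSketch {
    // Hypothetical stand-in for read_elements: fills the list and reports how many items it read.
    static int readElements(int n, List<Integer> elements) {
        for (int i = 0; i < n; i++) {
            elements.add(i);
        }
        return n;
    }

    static int sendRequest(int n) {
        List<Integer> elements = new ArrayList<>();
        if (readElements(n, elements) < n) {
            return -1; // nothing to free: the list simply becomes unreachable
        }
        return 0; // same here; the runtime reclaims the list at some later point
    }

    public static void main(String[] args) {
        System.out.println(sendRequest(4));
    }
}
```

Once sendRequest returns, nothing refers to the list any more, so the collector is free to reuse its memory whenever it next runs.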
Reference Counting
The idea we just demonstrated using C++’s shared pointer can be applied to all objects. Many languages, such as Perl, Python and PHP, use this approach. This can be easily explained through a picture:
The green cloud represents the objects that the program is still using. Technically, these are things like local variables of a method that is currently executing, or static variables. The details vary between programming languages, so they are not our focus here.
The blue circles represent live objects in memory, and each one shows how many references point to it. Objects in gray circles are no longer referenced by anything. They are therefore garbage and can be cleaned up by the garbage collector.
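The bookkeeping behind those counts can be sketched in a few lines of Java. This is an illustration only: the RefCounted class and its retain/release methods are made-up names, and a real reference-counting runtime would update the counts for you instead of requiring explicit calls.

```java
final class RefCounted {
    private final String name;
    private int count;          // number of references currently pointing at this object
    private boolean reclaimed;  // set once the count has dropped back to zero

    RefCounted(String name) {
        this.name = name;
    }

    /** Someone took a new reference to this object. */
    void retain() {
        if (reclaimed) {
            throw new IllegalStateException(name + " was already reclaimed");
        }
        count++;
    }

    /** Someone dropped a reference; reclaim the object when none are left. */
    void release() {
        if (count == 0) {
            throw new IllegalStateException(name + " has no references to release");
        }
        count--;
        if (count == 0) {
            reclaimed = true; // a real runtime would free the memory here
        }
    }

    int count() {
        return count;
    }

    boolean isReclaimed() {
        return reclaimed;
    }

    public static void main(String[] args) {
        RefCounted buffer = new RefCounted("buffer");
        buffer.retain();   // first owner
        buffer.retain();   // handed to a second owner
        buffer.release();  // first owner goes out of scope
        System.out.println(buffer.count() + " " + buffer.isReclaimed()); // prints "1 false"
        buffer.release();  // second owner goes out of scope
        System.out.println(buffer.count() + " " + buffer.isReclaimed()); // prints "0 true"
    }
}
```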
Looks pretty good, right? Yes, but there is a major flaw. It is easy to end up with detached cycles: objects that are not reachable from any scope, yet refer to each other, so their reference counts never drop to zero. Here is an example:
As you can see, the objects in red are in fact garbage that the application no longer uses. Because of this limitation of reference counting, they are never reclaimed, and the result is a memory leak.
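The leak can be reproduced with a small simulation. The sketch below is not how any particular language implements reference counting; CountedNode and its methods are hypothetical names, and the "root" reference is modeled by bumping the counter by hand.

```java
import java.util.ArrayList;
import java.util.List;

final class CountedNode {
    final String name;
    int refCount;       // how many references point at this node
    boolean reclaimed;  // set when the count reaches zero
    private final List<CountedNode> outgoing = new ArrayList<>();

    CountedNode(String name) {
        this.name = name;
    }

    /** Store a reference to target and count it. */
    void pointTo(CountedNode target) {
        outgoing.add(target);
        target.refCount++;
    }

    /** Drop one reference to this node; on zero, reclaim it and release what it points to. */
    void release() {
        refCount--;
        if (refCount == 0) {
            reclaimed = true;
            for (CountedNode target : outgoing) {
                target.release();
            }
            outgoing.clear();
        }
    }

    public static void main(String[] args) {
        CountedNode a = new CountedNode("a");
        CountedNode b = new CountedNode("b");
        a.refCount++;   // reference held by the program itself (the green cloud)
        a.pointTo(b);   // b.refCount == 1
        b.pointTo(a);   // a.refCount == 2

        a.release();    // the program lets go of a: a.refCount == 1, not 0

        System.out.println(a.name + " count=" + a.refCount + " reclaimed=" + a.reclaimed);
        System.out.println(b.name + " count=" + b.refCount + " reclaimed=" + b.reclaimed);
    }
}
```

Both nodes end up with a count of 1 and neither is reclaimed, even though the program can no longer reach them.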
There are several ways to solve this problem, such as using special "weak" references, or running a separate algorithm that collects cycles. The languages mentioned earlier, Perl, Python and PHP, all deal with cycles in one way or another, but that is beyond the scope of this article. Instead, we will look in detail at the approach taken by the JVM.
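To illustrate the weak-reference idea within a reference-counting scheme, here is a minimal sketch. WeakLinkNode and its methods are hypothetical, and this is not Java's own java.lang.ref.WeakReference: the only point is that a weak link is stored without being counted, so a back-pointer no longer keeps a cycle alive.

```java
final class WeakLinkNode {
    int refCount;        // counts strong references only
    boolean reclaimed;
    WeakLinkNode strong; // counted reference, keeps its target alive
    WeakLinkNode weak;   // uncounted reference, does not keep its target alive

    void setStrong(WeakLinkNode target) {
        strong = target;
        target.refCount++;
    }

    void setWeak(WeakLinkNode target) {
        weak = target; // deliberately leaves target.refCount untouched
    }

    /** Drop one strong reference; on zero, reclaim this node and release its strong target. */
    void release() {
        refCount--;
        if (refCount == 0) {
            reclaimed = true;
            if (strong != null) {
                WeakLinkNode target = strong;
                strong = null;
                target.release();
            }
        }
    }

    public static void main(String[] args) {
        WeakLinkNode parent = new WeakLinkNode();
        WeakLinkNode child = new WeakLinkNode();
        parent.refCount++;        // reference held by the program
        parent.setStrong(child);  // parent -> child is counted
        child.setWeak(parent);    // child -> parent is not counted

        parent.release();         // the program lets go of parent

        System.out.println("parent reclaimed: " + parent.reclaimed);
        System.out.println("child reclaimed:  " + child.reclaimed);
    }
}
```

Because the back-pointer is weak, dropping the program's reference brings the parent's count to zero, which in turn releases the child. A real implementation would also have to clear or invalidate the weak pointer at that moment, which this sketch leaves out.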
Mark and Sweep
First of all, the JVM is more precise about what it means for an object to be reachable. Instead of the vaguely defined green cloud from the previous sections, it has a very clear and specific set of objects called Garbage Collection Roots:
Local variables
Active threads
Static fields
JNI references
Others (will be discussed later)
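Two of these roots are easy to observe from plain Java. The sketch below uses java.lang.ref.WeakReference as a probe, since a weak reference does not keep its target alive by itself. The class and method names are made up for the example, and System.gc() is only a request to the JVM, so nothing is promised about the object that has no root pointing to it.

```java
import java.lang.ref.WeakReference;

final class GcRootsSketch {
    // Static field: a GC root, so the object it references stays reachable.
    static final Object STATIC_ROOT = new Object();

    /** Returns true if the object held by the static field is still there after a GC request. */
    static boolean staticRootSurvivesGc() {
        WeakReference<Object> probe = new WeakReference<>(STATIC_ROOT);
        System.gc(); // only a request; the JVM may ignore it
        return probe.get() != null;
    }

    public static void main(String[] args) {
        Object local = new Object(); // local variable: a GC root while main() is running
        WeakReference<Object> localProbe = new WeakReference<>(local);
        WeakReference<Object> unrootedProbe = new WeakReference<>(new Object()); // no root points here

        System.gc();

        System.out.println("static root alive: " + staticRootSurvivesGc());
        // Comparing against 'local' also keeps the variable in use after the GC request.
        System.out.println("local alive:       " + (localProbe.get() == local));
        // No guarantee either way: the collector decides when unreachable objects go.
        System.out.println("unrooted alive:    " + (unrootedProbe.get() != null));
    }
}
```

The first two lines always report true, because those objects are reachable from a root. The third typically reports false on common JVMs after a collection, but that depends on whether and when the collector actually runs.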
The JVM uses the mark and sweep algorithm to track all reachable (live) objects and to make sure that the memory claimed by unreachable objects can be reused. This involves two steps:
Marking walks through all reachable objects, starting from the GC roots, and keeps a record of them in native memory.
Sweeping makes sure that the memory addresses occupied by unreachable objects can be reused by subsequent allocations.
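The two steps can be sketched on a toy object graph. This is a simplified model under stated assumptions, not how the JVM implements it: ToyHeap, Obj and their methods are invented for the example, the "heap" is just a list, and the record of marked objects is an ordinary HashSet.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ToyHeap {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>(); // outgoing references

        Obj(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    final List<Obj> objects = new ArrayList<>(); // every allocated object
    final List<Obj> roots = new ArrayList<>();   // stand-in for the GC roots

    Obj allocate(String name) {
        Obj obj = new Obj(name);
        objects.add(obj);
        return obj;
    }

    /** Mark: walk everything reachable from the roots and record it. */
    Set<Obj> mark() {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {        // first visit only, so cycles terminate
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    /** Sweep: drop every object that was not marked and report how many were freed. */
    int sweep(Set<Obj> marked) {
        int before = objects.size();
        objects.retainAll(marked);
        return before - objects.size();
    }

    int collect() {
        return sweep(mark());
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        Obj a = heap.allocate("a");
        Obj b = heap.allocate("b");
        Obj c = heap.allocate("c");
        Obj d = heap.allocate("d");

        heap.roots.add(a);
        a.refs.add(b);  // b is reachable through a
        c.refs.add(d);  // c and d only reference each other:
        d.refs.add(c);  // a detached cycle

        int freed = heap.collect();
        System.out.println("freed=" + freed + " live=" + heap.objects); // prints "freed=2 live=[a, b]"
    }
}
```

Note that c and d are swept even though they reference each other, because reachability is decided from the roots rather than from per-object counts.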
Different GC algorithms in the JVM, such as Parallel Scavenge, Parallel Mark+Copy, and CMS, implement these phases slightly differently, but conceptually they all follow the two steps described above.
The most important thing about this approach is that cycles of objects no longer leak:
The disadvantage is that application threads need to be suspended for the collection to take place, because references cannot be traced reliably while they keep changing. A situation in which the application is temporarily stopped so that the JVM can do its housekeeping is called a Stop The World (STW) pause. Such pauses can be triggered for many reasons, but garbage collection is probably the most common one.
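One way to get a rough feel for how much collection work a JVM is doing is the standard java.lang.management API. The sketch below is an assumption-laden illustration: the class name and the allocation loop are invented, and the reported time is the collector's accumulated collection time, which for concurrent collectors is not the same thing as STW pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcActivitySketch {
    /** Sums the collection counts reported by all collectors of the running JVM. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so the collector may have something to do.
        long allocated = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            allocated += junk.length;
        }

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, about " + gc.getCollectionTime() + " ms");
        }
        System.out.println("allocated bytes: " + allocated);
        System.out.println("total collections: " + totalCollections());
    }
}
```

For actual pause durations, the JVM's GC logging is the more precise source; this snippet only shows that collections happen behind the application's back.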