
Understand Java garbage collection mechanism

WBOY
2023-04-24 14:10:07

Before discussing remembered sets and card tables, let me first introduce the problem of cross-generation references.

[Figure: an old-generation object holding a reference to new-generation object 1]

Suppose you want to perform a collection limited to the new generation (a Minor GC), but object 1 in the new generation is referenced from the old generation. To find all surviving objects in the new generation, the collector would have to traverse every object in the old generation in addition to the fixed GC Roots to guarantee that the reachability analysis is correct. The same applies in the other direction, when only the old generation is collected. Traversing the whole old generation is feasible in theory, but it places a heavy performance burden on memory reclamation.

In fact, cross-generation references are not only a problem between the new generation and the old generation. Every garbage collector that performs partial collection (Partial GC), such as G1, ZGC, and Shenandoah, faces the same problem.

So how can we solve the cross-generation reference problem?

First of all, cross-generation references make up only a very small share of all references compared with references within the same generation. The reason is that objects linked by a cross-generation reference tend to survive or die together. For example, if a new generation object is referenced from the old generation, the old generation object is unlikely to die, so the reference keeps the new generation object alive through each collection. As it ages, it is promoted to the old generation, at which point the cross-generation reference disappears.

Based on this, we no longer need to scan the entire old generation for the sake of a few cross-generation references, nor do we need to waste space recording, for every object, whether it has cross-generation references and which ones. We only need to create a global data structure, called a "Remembered Set". This structure divides the old generation into several small blocks and identifies which blocks may contain cross-generation references. Afterwards, when a Minor GC occurs, only the objects in the blocks that contain cross-generation references are added to the GC Roots for scanning. This approach has to keep the recorded data correct whenever an object changes its reference relationships (for example, when the object itself or one of its fields is assigned), which adds some runtime overhead, but it is still far cheaper than scanning the entire old generation during collection.
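The bookkeeping described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: old-generation objects are numbered slots, a block is a fixed number of slots, and the names ToyRememberedSet, BLOCK_SIZE and onReferenceStore are hypothetical rather than taken from any JVM.

```java
import java.util.Set;
import java.util.TreeSet;

// Toy model only: old-generation objects are numbered slots, and
// BLOCK_SIZE consecutive slots form one block. All names are hypothetical.
class ToyRememberedSet {
    static final int BLOCK_SIZE = 4; // assumed block size, in slots

    // Blocks of the old generation that may hold a cross-generation reference.
    final Set<Integer> blocksWithCrossGenRefs = new TreeSet<>();

    // Called whenever the old-generation object in slot oldSlot stores a reference.
    // targetIsYoung says whether the referenced object lives in the new generation.
    void onReferenceStore(int oldSlot, boolean targetIsYoung) {
        if (targetIsYoung) {
            blocksWithCrossGenRefs.add(oldSlot / BLOCK_SIZE);
        }
    }

    public static void main(String[] args) {
        ToyRememberedSet rs = new ToyRememberedSet();
        rs.onReferenceStore(1, true);   // slot 1 is in block 0
        rs.onReferenceStore(2, true);   // slot 2 is in block 0 too, no new record
        rs.onReferenceStore(9, false);  // old-to-old reference, not recorded
        rs.onReferenceStore(13, true);  // slot 13 is in block 3
        System.out.println(rs.blocksWithCrossGenRefs); // prints [0, 3]
    }
}
```

The extra work in onReferenceStore is the runtime overhead mentioned above: it is paid on every reference assignment so that a Minor GC only has to look at blocks 0 and 3.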

Let's look at this global data structure, the remembered set.

Remembered set

A remembered set is an abstract data structure that records the set of pointers from the non-collection area into the collection area. If efficiency and cost are ignored, the simplest implementation is an array of all objects in the non-collection area that contain cross-generation references, as shown in the following code:

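// Pseudocode: a remembered set implemented with object pointers
class RememberedSet {
    // Placeholder capacity for illustration; the original leaves this constant undefined
    static final int OBJECT_INTERGENERATIONAL_REFERENCE_SIZE = 1024;
    // Every object in the non-collection area that holds a cross-generation reference
    Object[] set = new Object[OBJECT_INTERGENERATIONAL_REFERENCE_SIZE];
}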

Recording every object that contains a cross-generation reference in this way is expensive in both space and maintenance cost. In a garbage collection scenario, the collector only needs the remembered set to tell whether a given non-collection area contains pointers into the collection area. It does not need to know every detail of those cross-generation pointers. Designers can therefore choose a coarser record granularity to reduce the storage and maintenance cost of the remembered set. Some possible record precisions are listed below (other choices are possible too):

  • Word precision: Each record is precise to one machine word (the processor's addressing width, commonly 32 or 64 bits, which determines the length of the pointers the machine uses to access physical memory addresses). The word contains a cross-generation pointer.

  • Object precision: Each record is precise to one object. The object has fields that contain cross-generation pointers.

  • Card precision: Each record is precise to one memory region. The region contains objects that hold cross-generation pointers.

The third option, "card precision", implements the remembered set with a structure called a "Card Table", which is currently the most commonly used form of remembered set implementation.
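To make the trade-off between the three precisions concrete, the sketch below counts how many records each one needs for the same handful of cross-generation pointers. The addresses, the 8-byte word and the 512-byte region are assumptions chosen for illustration, and PrecisionDemo is a hypothetical name.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: counts how many records each precision needs for the
// same set of cross-generation pointers. Addresses and sizes are made up.
class PrecisionDemo {
    static final int WORD_BYTES = 8;     // assumed 64-bit machine word
    static final int REGION_BYTES = 512; // assumed region size for card precision

    // Each row: {start address of the owning object, address of the pointer field}
    static final long[][] POINTERS = {
        {0x1000, 0x1010},
        {0x1000, 0x1018}, // second pointer field in the same object
        {0x1040, 0x1050},
        {0x1400, 0x1408},
    };

    static int wordRecords()   { return distinct(1, WORD_BYTES); }
    static int objectRecords() { return distinct(0, 1); }
    static int cardRecords()   { return distinct(1, REGION_BYTES); }

    // Counts distinct values of column col after dividing by unit.
    static int distinct(int col, int unit) {
        Set<Long> seen = new HashSet<>();
        for (long[] p : POINTERS) {
            seen.add(p[col] / unit);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // prints word=4 object=3 card=2
        System.out.println("word=" + wordRecords()
                + " object=" + objectRecords()
                + " card=" + cardRecords());
    }
}
```

The coarser the precision, the fewer records have to be stored and updated, at the price of scanning more memory per record during collection.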

What is the relationship between the card table and the remembered set?

When I introduced the remembered set earlier, I mentioned that it is an "abstract" data structure. Abstract means that it only defines the intended behavior of a remembered set, not how that behavior is implemented. The card table is one concrete implementation of the remembered set: it defines the record precision, the mapping to heap memory, and so on. The relationship between the remembered set and the card table can be understood by analogy with Map and HashMap in Java, that is, the relationship between an interface and an implementing class.
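The analogy can be written out as a small sketch. The interface and both classes below are hypothetical and exist only to mirror the Map/HashMap relationship; a real JVM does not expose such types, and the 512-byte card size is an assumption here.

```java
import java.util.HashSet;
import java.util.Set;

class RememberedSetAnalogy {
    // Hypothetical interface: states what a remembered set does, not how.
    interface RememberedSet {
        void record(long fieldAddress);               // note a cross-generation pointer
        boolean mayHoldCrossGenPointer(long address); // queried before scanning
    }

    // One possible implementation: exact word addresses kept in a hash set.
    static class WordPreciseSet implements RememberedSet {
        private final Set<Long> words = new HashSet<>();
        public void record(long fieldAddress) { words.add(fieldAddress); }
        public boolean mayHoldCrossGenPointer(long address) { return words.contains(address); }
    }

    // Another: a card table with one byte per (assumed) 512-byte card page.
    static class CardTableSet implements RememberedSet {
        private final byte[] cards;
        CardTableSet(int heapBytes) { cards = new byte[heapBytes >>> 9]; }
        public void record(long fieldAddress) { cards[(int) (fieldAddress >>> 9)] = 1; }
        public boolean mayHoldCrossGenPointer(long address) {
            return cards[(int) (address >>> 9)] == 1;
        }
    }

    public static void main(String[] args) {
        RememberedSet exact = new WordPreciseSet();
        RememberedSet coarse = new CardTableSet(4096);
        exact.record(0x210);
        coarse.record(0x210);
        // 0x218 lies in the same card page as 0x210 but is a different word.
        System.out.println(exact.mayHoldCrossGenPointer(0x218));  // prints false
        System.out.println(coarse.mayHoldCrossGenPointer(0x218)); // prints true
    }
}
```

Code written against the interface does not care which implementation it gets, just as code written against Map does not care whether it receives a HashMap.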

Next, let's look in detail at the card table, the concrete implementation of the remembered set.

Card table

The card table is implemented as a byte array, CARD_TABLE[]. Each element of the array corresponds to a memory block of a specific size in the memory area it identifies, and each such block is called a card page. The card page used by HotSpot is 2^9 bytes, that is, 512 bytes, as shown in the figure below.

[Figure: card table elements and the card pages they map to]
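Because a card page is 2^9 bytes, the mapping between an address and its card table element is a shift by 9. The sketch below works through that arithmetic; the region base address and the 1 MiB region size are made-up values, and CardIndexDemo and its method names are hypothetical.

```java
class CardIndexDemo {
    static final int CARD_SHIFT = 9;              // 2^9 = 512-byte card pages
    static final int CARD_SIZE = 1 << CARD_SHIFT;

    // Number of card table elements needed to cover regionBytes bytes.
    static int cardCount(long regionBytes) {
        return (int) ((regionBytes + CARD_SIZE - 1) >>> CARD_SHIFT);
    }

    // Index of the card table element that covers address.
    static int cardIndex(long regionBase, long address) {
        return (int) ((address - regionBase) >>> CARD_SHIFT);
    }

    // First address of the card page that element index stands for.
    static long cardPageStart(long regionBase, int index) {
        return regionBase + ((long) index << CARD_SHIFT);
    }

    public static void main(String[] args) {
        long base = 0x10000L; // assumed start address of the covered region
        System.out.println(cardCount(1L << 20));                        // prints 2048
        System.out.println(cardIndex(base, base + 0x3FF));              // prints 1
        System.out.println(Long.toHexString(cardPageStart(base, 1)));   // prints 10200
    }
}
```

A 1 MiB region therefore needs only 2048 one-byte elements, which is why card precision is cheap to store.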

In this way, a memory area can be divided into card pages. If we now want to garbage collect the new generation, we can divide the old generation into card pages one by one, as shown in the figure below.

[Figure: the old generation divided into card pages, with card page 1 holding a reference into the new generation]

As shown in the figure, card page 1 contains a cross-generation reference pointing into the new generation, so the first position of the corresponding card table is set to 1, indicating that this page contains an object with a cross-generation reference.

  • Card table perspective: Because page 1 contains an object with a cross-generation reference, the first position of the card table is recorded as 1, indicating that the element for page 1 is dirty.

  • Memory reclamation perspective: Because the first position of the card table is 1, the page contains an object with a cross-generation reference, so this area must be scanned during garbage collection.

A card page usually contains more than one object. As long as a field of one (or more) of the objects in the card page holds a cross-generation pointer, the corresponding card table element is set to 1 and the element is said to be dirty; otherwise it is 0. When garbage collection occurs, filtering out the dirty elements of the card table is enough to find which card pages contain cross-generation pointers. Those pages are added to the GC Roots and scanned together, which removes the need to scan the entire old generation and greatly reduces the scanning range of the GC Roots.
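The filtering step can be sketched as a loop over the card table that yields the address ranges worth scanning. This is a simplified illustration, not how any particular collector is written: DirtyCardScan and dirtyRanges are hypothetical names, the old generation is assumed to start at address 0, and the 0/1 values follow this article's convention (a real JVM may encode clean and dirty differently).

```java
import java.util.ArrayList;
import java.util.List;

class DirtyCardScan {
    static final int CARD_SHIFT = 9;        // 512-byte card pages
    static final byte CLEAN = 0, DIRTY = 1; // the article's 0/1 convention

    // Returns [start, end) address ranges of the card pages whose element is dirty.
    static List<long[]> dirtyRanges(byte[] cardTable, long oldGenBase) {
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < cardTable.length; i++) {
            if (cardTable[i] == DIRTY) {
                long start = oldGenBase + ((long) i << CARD_SHIFT);
                ranges.add(new long[] {start, start + (1L << CARD_SHIFT)});
            }
        }
        return ranges;
    }

    public static void main(String[] args) {
        byte[] cardTable = new byte[8]; // covers 8 * 512 = 4096 bytes, all CLEAN
        cardTable[1] = DIRTY;
        cardTable[6] = DIRTY;

        List<long[]> ranges = dirtyRanges(cardTable, 0L);
        for (long[] r : ranges) {
            System.out.printf("scan [%d, %d)%n", r[0], r[1]);
        }
        // prints: scan [512, 1024) then scan [3072, 3584)
        long scanned = (long) ranges.size() << CARD_SHIFT;
        long total = (long) cardTable.length << CARD_SHIFT;
        System.out.printf("%d of %d bytes scanned%n", scanned, total); // 1024 of 4096
    }
}
```

Only the two dirty card pages are visited, so in this toy case a quarter of the old generation is scanned instead of all of it.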


Statement:
This article is reproduced from yisu.com. If there is any infringement, please contact admin@php.cn for removal.