Home >Java >javaTutorial >A brief discussion on some basic concepts of Java's GC garbage collector

A brief discussion on some basic concepts of Java's GC garbage collector

高洛峰
高洛峰Original
2017-01-17 15:53:381391browse

1. Basic recycling algorithm

1. Reference Counting
An older recycling algorithm. The principle is that this object has a reference, which increases a count, and deleting a reference decreases the count. During garbage collection, only objects with a count of 0 are collected. The most fatal thing about this algorithm is that it cannot handle the problem of circular references.
2. Mark-Sweep
This algorithm is executed in two stages. The first phase marks all referenced objects starting from the reference root node, and the second phase traverses the entire heap and clears unmarked objects. This algorithm requires suspending the entire application and, at the same time, causes memory fragmentation.
3. Copying
This algorithm divides the memory space into two equal areas, and only uses one of them at a time. During garbage collection, the currently used area is traversed and the objects in use are copied to another area. This algorithm only processes the objects that are in use each time, so the copy cost is relatively small. At the same time, the corresponding memory can be organized after copying, but there is a "fragmentation" problem. Of course, the disadvantage of this algorithm is also obvious, that is, it requires twice the memory space.
4. Mark-Compact
This algorithm combines the advantages of the "mark-clear" and "copy" algorithms. It is also divided into two phases. The first phase starts from the root node and marks all referenced objects. The second phase traverses the entire heap, clears unmarked objects and "compresses" surviving objects into one of the heaps, and discharges them in order. This algorithm avoids the fragmentation problem of "mark-and-sweep" and also avoids the space problem of the "copy" algorithm.
5. Incremental Collection (Incremental Collecting)
Implement the garbage collection algorithm, that is, perform garbage collection while the application is in progress. I don’t know why the collector in JDK5.0 does not use this algorithm.
6. Generational Collecting

A garbage collection algorithm based on the analysis of the object life cycle. Divide objects into young generation, old generation, and persistent generation, and use different algorithms (one of the above methods) to recycle objects with different life cycles. Current garbage collectors (starting from J2SE1.2) all use this algorithm.

1. Young (young generation)
The young generation is divided into three areas. One Eden area and two Survivor areas. Most objects are generated in the Eden area. When the Eden area is full, the surviving objects will be copied to the Survivor area (one of the two). When this Survivor area is full, the surviving objects in this area will be copied to another Survivor area. When this Survivor is gone, When it is full, objects copied from the first Survivor area and still alive at this time will be copied to the "Tenured" area. It should be noted that the two areas of Survivor are symmetrical and have no sequential relationship, so there may be objects copied from Eden and objects copied from the previous Survivor at the same time in the same area. However, only objects copied from the first area are copied to the old area. The object that Survivor came to. Moreover, one of the Survivor areas is always empty.
2. Tenured (Old Generation)
The old generation stores objects that survive from the young generation. Generally speaking, the old generation stores objects with longer lifetimes.
3. Perm (persistent generation)
Used to store static files, now Java classes, methods, etc. The persistent generation has no significant impact on garbage collection, but some applications may dynamically generate or call some classes, such as Hibernate, etc. In this case, a relatively large persistent generation space needs to be set up to store these newly added classes during operation. The size of the persistent generation is set with -XX:MaxPermSize=.


2. GC Type
There are two types of GC: Scavenge GC and Full GC.

1. Scavenge GC
Generally, when a new object is generated and fails to apply for space in Eden, Scavenge GC is triggered, the Eden area is GCed, non-survival objects are cleared, and the remaining objects are Surviving objects are moved to the Survivor area. Then organize the two areas of Survivor.
2. Full GC
Organize the entire heap, including Young, Tenured and Perm. Full GC is slower than Scavenge GC, so Full GC should be reduced as much as possible. The following reasons may cause Full GC:
* Tenured is filled up
* Perm field is filled up
* System.gc() is explicitly called
* Heap fields are allocated after the last GC Dynamic changes in strategies


Demonstration of generational garbage collection process

1.

A brief discussion on some basic concepts of Javas GC garbage collector

twenty three.

A brief discussion on some basic concepts of Javas GC garbage collector

4.

A brief discussion on some basic concepts of Javas GC garbage collector

2. Garbage collector


There are currently three main types of collectors : Serial collector, parallel collector, concurrent collector.

1. The serial collector

uses a single thread to handle all garbage collection work. Because there is no need for multi-thread interaction, it is more efficient. However, the advantages of multiple processors are not available, so this collector is suitable for single-processor machines. Of course, this collector can also be used on multi-processor machines with small data volumes (about 100M). Can be turned on with -XX:+UseSerialGC.

2. Parallel collector
1. Perform parallel garbage collection on the young generation, thus reducing garbage collection time. Typically used on multi-threaded multi-processor machines. Use -XX:+UseParallelGC. to open. The parallel collector was introduced in the sixth update of J2SE5.0 and was enhanced in Java SE6.0 - it can heap the old generation for parallel collection. If the old generation does not use concurrent collection, it will use a single thread for garbage collection, which will restrict expansion capabilities. Open with -XX:+UseParallelOldGC.
2. Use -XX:ParallelGCThreads= to set the number of threads for parallel garbage collection. This value can be set equal to the number of processors on the machine.
3. This collector can be configured as follows:
* Maximum garbage collection pause: Specify the maximum pause time during garbage collection, specified by -XX:MaxGCPauseMillis=. is milliseconds. If this value is specified, the heap size and garbage collection related parameters will be adjusted to achieve the specified value. Setting this value may reduce your application's throughput.
* Throughput: Throughput is the ratio of garbage collection time to non-garbage collection time, set by -XX:GCTimeRatio=, the formula is 1/(1+N). For example, -XX:GCTimeRatio=19 means that 5% of the time is used for garbage collection. The default is 99, which means 1% of the time is used for garbage collection.

3. Concurrent collector
can ensure that most work is performed concurrently (the application does not stop), and garbage collection is paused for only a small amount of time. This collector is suitable for medium and high-end applications that have high response time requirements. Large scale application. Turn on with -XX:+UseConcMarkSweepGC.
1. The concurrent collector mainly reduces the pause time of the old generation. It uses an independent garbage collection thread to track reachable objects without stopping the application. During each old generation garbage collection cycle, the concurrent collector will briefly pause the entire application at the beginning of the collection, and will pause again during the collection. The second pause will be slightly longer than the first, during which multiple threads are performing garbage collection work at the same time.
2. The concurrent collector uses the processor in exchange for short pause times. On a system with N processors, the concurrent collection part uses K/N available processors for recycling. Generally, 13. Using the concurrent collector on a host with only one processor and setting it to incremental mode can also result in shorter pause times.
4. Floating Garbage: Since garbage collection is performed while the application is running, some garbage may be generated when the garbage collection is completed, resulting in "Floating Garbage". This garbage needs to be collected in the next garbage collection cycle. Lose. Therefore, the concurrent collector generally requires 20% reserved space for these floating garbage.
5. Concurrent Mode Failure: The concurrent collector collects while the application is running, so it is necessary to ensure that the heap has enough space for the program to use during garbage collection. Otherwise, the heap space will be full before the garbage collection is completed. . In this case, a "concurrency mode failure" will occur, and the entire application will be suspended for garbage collection.
6. Start the concurrent collector: Because concurrent collection is collected while the application is running, you must ensure that there is enough memory space for the program to use before the collection is completed, otherwise "Concurrent Mode Failure" will occur. By setting -XX:CMSInitiatingOccupancyFraction= specify how much remaining heap there is to start executing concurrent collection

4. Summary
* Serial processor:
--Applicable situation: data volume Relatively small (about 100M); an application that runs on a single processor and has no requirements on response time.
--Disadvantages: Can only be used for small applications
* Parallel processor:
--Applicable situations: "high requirements for throughput", medium and large applications with multiple CPUs and no requirements for application response time Large applications. Examples: background processing, scientific computing.
--Disadvantage: application response time may be longer
* Concurrent processor:
--Applicable situation: "high requirements on response time", multiple CPUs, high requirements on application response time Medium and large applications. Examples: Web server/application server, telecommunications switching, integrated development environment.

3. Basic principles of GC
GC (Garbage Collection) is the garbage collector in JAVA/.NET.
Java was developed from C++. It abandoned some cumbersome and error-prone things in C++ and introduced the concept of counters. One of them is the GC mechanism (C# borrowed from JAVA)
Programmers are prone to problems In some places, forgotten or wrong memory recycling can lead to instability or even crash of the program or system. The GC function provided by Java can automatically monitor whether the object exceeds the scope to achieve the purpose of automatically recycling memory. The Java language does not provide a way to release allocated memory. Show how to do it. Therefore, Java's memory management is actually the management of objects, including the allocation and release of objects.
For programmers, use the new keyword to allocate objects; when releasing an object, just assign all references to the object to null so that the program can no longer access the object. We call the object "unreachable". The GC will be responsible for reclaiming the memory space of all "unreachable" objects.
For GC, when the programmer creates an object, the GC begins to monitor the address, size and usage of the object. Usually, GC uses a directed graph to record and manage all objects in the heap. In this way, it is determined which objects are "reachable" and which objects are "unreachable". When the GC determines that some objects are "unreachable", the GC is responsible for reclaiming these memory spaces. However, in order to ensure that GC can be implemented on different platforms, the Java specification does not strictly stipulate many behaviors of GC. For example, there are no clear regulations on important issues such as what type of recycling algorithm to use and when to perform recycling. Therefore, different JVM implementers often have different implementation algorithms. This also brings a lot of uncertainty to the development of Java programmers. This article studies several issues related to GC work and strives to reduce the negative impact of this uncertainty on Java programs.

4. GC Generational Division
The Heap in the JVM memory model is divided into two large blocks, one is Young Generation and the other is Old Generation

A brief discussion on some basic concepts of Javas GC garbage collector

1) In Young Generation, there is a space called Eden Space, which is mainly used to store new objects. There are also two Survivor Spaces (from, to). Their sizes are always the same. They are used to store garbage every time. Objects that survive recycling.
2) In Old Generation, it mainly stores memory objects with long life cycles in the application.
3) In the Young Generation block, garbage collection generally uses the Copying algorithm, which is fast. During each GC, the surviving objects are first copied from Eden to a SurvivorSpace. When the Survivor Space is full, the remaining live objects are copied directly to OldGeneration. Therefore, after each GC, the Eden memory block will be cleared.
4) In the Old Generation block, garbage collection generally uses the mark-compact algorithm, which is slower but reduces memory requirements.
5) Garbage collection is divided into multiple levels. Level 0 is full (Full) garbage collection, which will recycle the garbage in the OLD segment; level 1 or above is partial garbage collection, which will only recycle the garbage in Young. Memory overflow usually This occurs when there is still no memory space to accommodate new Java objects after the OLD segment or Perm segment is garbage collected.

5. Incremental GC
Incremental GC (Incremental GC) is a GC that is usually implemented by one or a group of processes in the JVM. It itself also occupies the same heap space as the user program and runs It also takes up CPU.
When the GC process is running, the application stops running. Therefore, when GC runs for a long time, users can feel the pause of the Java program. On the other hand, if GC runs for too short, the object recycling rate may be too low, which means that there are still many objects that should be recycled but have not been recycled. , still takes up a lot of memory. Therefore, when designing GC, a trade-off must be made between pause time and recovery rate. A good GC implementation allows users to define the settings they need. For example, some devices with limited memory are very sensitive to memory usage. They hope that the GC can accurately reclaim memory, and it does not care about the speed of the program. Other real-time online games cannot allow long program interruptions.
Incremental GC uses a certain recycling algorithm to divide a long interruption into many small interruptions. In this way, the impact of GC on user programs is reduced. Although incremental GC may not be as efficient as ordinary GC in terms of overall performance, it can reduce the maximum pause time of the program.
The HotSpot JVM provided by Sun JDK can support incremental GC. The default GC mode of HotSpot JVM is not to use incremental GC. In order to start incremental GC, we must add the -Xincgc parameter when running the Java program.
HotSpot JVM incremental GC is implemented using the Train GC algorithm. Its basic idea is to group (stratify) all objects in the heap according to their creation and usage, and group frequently used and relevant objects. Objects are placed in a group, and the groups are continuously adjusted as the program runs. When GC runs, it always recycles the oldest (rarely accessed recently) objects first. If the entire group is recyclable, the GC will recycle the entire group. In this way, each GC run only recovers a certain proportion of unreachable objects to ensure the smooth running of the program.

For more articles about some basic concepts of Java's GC garbage collector, please pay attention to the PHP Chinese website!

Statement:
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn