Analysis of .NET Garbage Collection (GC) Principles

怪我咯
Original, 2017-04-05 13:33:44

The garbage collector (GC) is an advanced .NET topic that every developer should understand. Aiming to keep things easy to follow, this article explains how the garbage collector in the CLR works.

Basic knowledge

Managed Heap

Let's start with MSDN's explanation: when a new process is initialized, the runtime reserves a contiguous region of address space for the process. This reserved address space is called the managed heap.

The managed heap is still a heap. I stress this so that nobody is thrown off by the terminology. This section builds on the difference between value types and reference types, and assumes the reader knows the usual rule of thumb: value types are stored on the stack and reference types on the heap, with the references themselves held on the stack. This is a simplification, since a value type that is a field of a reference type lives on the heap inside that object. Following this rule, the CLR requires that all reference-type objects be allocated from the managed heap.

The managed heap maintains a pointer, here named NextObjPtr, which points to the allocation location of the next object in the heap.
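
As a rough illustration of why allocation through a single pointer is cheap, here is a minimal Python sketch of a heap that only ever advances NextObjPtr. The class name `ManagedHeap`, the byte sizes, and the behavior on exhaustion are assumptions made for this example; this is a toy model, not the CLR's actual implementation.

```python
class ManagedHeap:
    """Toy model of a heap that allocates by advancing a single pointer."""

    def __init__(self, size):
        self.size = size          # bytes reserved for the heap
        self.next_obj_ptr = 0     # offset where the next object will be placed

    def allocate(self, nbytes):
        """Return the offset of the new object, or None if it does not fit."""
        if self.next_obj_ptr + nbytes > self.size:
            return None           # a real runtime would trigger a collection here
        address = self.next_obj_ptr
        self.next_obj_ptr += nbytes   # bump the pointer past the new object
        return address


heap = ManagedHeap(size=64)
first = heap.allocate(16)     # placed at offset 0
second = heap.allocate(24)    # placed directly after the first object
too_big = heap.allocate(32)   # does not fit in the remaining 24 bytes
```

Each allocation is a bounds check plus one addition, and consecutive objects end up next to each other in memory.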

CPU Registers

This is basic computer knowledge, reviewed here to help with the concept of roots that follows.

CPU registers are the CPU's own temporary storage and are faster to access than main memory. Ordered by distance from the CPU, registers come first, then the caches (L1, L2, and L3), and finally main memory.

Roots

Static fields defined in a class, method parameters, and local variables are all roots, as long as they are of a reference type. Object pointers held in CPU registers are roots as well. Roots are the entry points into the heap that the CLR can find outside of the heap.

Reachable and unreachable objects

If a root refers to an object in the heap, directly or through a chain of references from other reachable objects, the object is "reachable"; otherwise it is "unreachable".

The reason for garbage collection

From the perspective of computer organization, every program must reside in memory to run, and memory is limited in size. The managed heap has a size limit as well. If it did not, allocation in C# would be faster than in C, because the structure of the managed heap lets it allocate objects faster than the C runtime heap. Because address space and storage are limited, the managed heap relies on garbage collection to keep operating normally and to ensure that objects can be allocated without running out of memory.

Basic principles of garbage collection

Collection is divided into two phases: marking, then compaction.

Marking is the process of determining which objects are reachable. Once all roots have been checked, the heap contains reachable (marked) and unreachable (unmarked) objects.
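
The marking phase can be sketched as a graph traversal that starts from the roots. In the Python sketch below, the object names, the `references` dictionary, and the function name `mark` are all hypothetical and only stand in for real heap objects and their reference fields.

```python
def mark(roots, references):
    """Return the set of objects reachable from the roots.

    roots: iterable of object names referenced from outside the heap.
    references: dict mapping each object name to the names it refers to.
    """
    marked = set()
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if obj in marked:
            continue              # already visited; also guards against cycles
        marked.add(obj)
        pending.extend(references.get(obj, []))
    return marked


# Hypothetical heap: A and D are referenced by roots, C and E only refer to each other.
references = {"A": ["B"], "B": [], "C": ["E"], "D": ["B"], "E": ["C"]}
reachable = mark(roots=["A", "D"], references=references)
unreachable = set(references) - reachable
```

Note that C and E reference each other but are still unreachable, because no root leads to either of them.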

After marking completes, the compaction phase begins. In this phase, the garbage collector walks the heap linearly, looking for contiguous blocks of memory occupied by unreachable objects, and moves reachable objects into that space to compact the heap. The process is similar to defragmenting a disk.

In the figure, green boxes are reachable objects and yellow boxes are unreachable objects. Once the unreachable objects are cleared, moving the reachable objects together compacts the memory.

After compaction, the variables and CPU registers that held pointers to the moved objects are invalid, so the garbage collector must revisit all roots and update them to point to the objects' new locations. This can cause a significant performance loss, which is the main disadvantage of the managed heap.
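
Compaction and the pointer fix-up that follows it can be sketched together. The Python example below assumes a heap laid out as a list of `(name, size)` pairs in address order and roots stored as a dictionary of addresses; these structures and the function name `compact` are illustrative assumptions, not how the CLR represents its heap.

```python
def compact(heap, marked, roots):
    """Slide marked objects toward the start of the heap and fix up the roots.

    heap: list of (name, size) tuples in address order.
    marked: set of names that survived the mark phase.
    roots: dict mapping a root name to the address it currently points to.
    Returns (new_heap, new_roots, next_obj_ptr).
    """
    forwarding = {}               # old address -> new address
    new_heap = []
    old_address = 0
    new_address = 0
    for name, size in heap:
        if name in marked:
            forwarding[old_address] = new_address
            new_heap.append((name, size))
            new_address += size   # survivors are packed back to back
        old_address += size       # dead objects are simply skipped
    # Every root that pointed at a moved object must be updated.
    new_roots = {root: forwarding[addr] for root, addr in roots.items()}
    return new_heap, new_roots, new_address


heap = [("A", 16), ("C", 8), ("D", 24), ("E", 8), ("B", 16)]
roots = {"local_a": 0, "static_d": 24}
new_heap, new_roots, next_obj_ptr = compact(heap, {"A", "B", "D"}, roots)
```

In this run, D moves from address 24 to 16, so the root `static_d` has to be rewritten, and the allocation pointer drops from 72 to 56. The sketch tracks only roots; references stored inside surviving objects would need the same forwarding step.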

Because of this cost, how and when to collect is itself a subject of research. If the collector really waited until the entire managed heap was full before collecting, it would be very slow.

Garbage collection algorithm: generations

Generations are a mechanism used by the CLR garbage collector, and their only purpose is to improve application performance. Collecting a single generation is clearly faster than collecting the entire heap.

The CLR managed heap supports three generations: generation 0, generation 1, and generation 2. Generation 0 is about 256 KB, generation 1 about 2 MB, and generation 2 about 10 MB; these are approximate initial budgets rather than fixed limits. Newly constructed objects are allocated in generation 0.

As shown in the figure, when generation 0 is full, the garbage collector starts a collection. Unreachable objects (C and E in the figure) are reclaimed, and surviving objects are promoted to generation 1.

When generation 0 is full and generation 1 has also filled up to the point that it is almost full, both generations are collected. Among the surviving (reachable) objects, those in generation 0 are promoted to generation 1 and those in generation 1 are promoted to generation 2.
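
The promotion rule can be sketched with three sets, one per generation. The function name `collect`, the `up_to` parameter, and the object names below are assumptions for illustration; the reachable set is passed in directly rather than computed by a marking pass.

```python
def collect(generations, reachable, up_to):
    """Collect generations 0..up_to and promote the survivors by one generation.

    generations: list of three sets of object names, index = generation number.
    reachable: set of names that are still reachable.
    """
    for gen in range(up_to, -1, -1):          # oldest collected generation first
        survivors = generations[gen] & reachable
        target = min(gen + 1, 2)              # generation 2 is the oldest
        if target == gen:
            generations[gen] = survivors
        else:
            generations[target] |= survivors
            generations[gen] = set()
    return generations


# First collection: only generation 0 is collected.
gens = [{"A", "B", "C", "D", "E"}, set(), set()]
collect(gens, reachable={"A", "B", "D"}, up_to=0)
after_first = [set(g) for g in gens]

# New objects F and G are allocated, B becomes garbage, then generations 0 and 1 are collected.
gens[0] = {"F", "G"}
collect(gens, reachable={"A", "D", "F"}, up_to=1)
```

Processing the older generation first keeps an object from being promoted twice within one collection.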

The actual CLR generational mechanism is more "intelligent". If newly created objects are short-lived, so that very few survive a generation 0 collection, the garbage collector may collect generation 0 sooner, without waiting for its full space to be allocated. Conversely, if a generation 0 collection finds that many objects are still reachable and little memory was released, the generation 0 budget is increased to 512 KB. Collections then happen less often, but each one reclaims a larger amount of memory. If a collection still has not released enough memory, the garbage collector performs a full collection of all three generations, and if that is still not enough, an OutOfMemoryException is thrown.

In other words, the garbage collector dynamically adjusts the allocation budget of each generation based on how much memory each collection reclaims, so it tunes itself automatically.
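
A budget adjustment of this kind might look like the following Python sketch. The survival-rate thresholds (`low`, `high`), the bounds (`min_kb`, `max_kb`), the doubling and halving steps, and the function name `adjust_budget` are all assumptions chosen for the example; the CLR's real heuristics are not described in this article.

```python
def adjust_budget(budget_kb, survived_kb, low=0.1, high=0.5,
                  min_kb=128, max_kb=4096):
    """Return a new generation 0 budget based on how much survived the last collection."""
    survival_rate = survived_kb / budget_kb
    if survival_rate > high:
        return min(budget_kb * 2, max_kb)   # little was reclaimed: collect less often
    if survival_rate < low:
        return max(budget_kb // 2, min_kb)  # almost everything died: keep generation 0 small
    return budget_kb                        # otherwise leave the budget alone


grown = adjust_budget(256, survived_kb=200)     # most objects survived
shrunk = adjust_budget(256, survived_kb=10)     # almost nothing survived
unchanged = adjust_budget(256, survived_kb=64)  # in between
```

With these assumed thresholds, a 256 KB budget grows to 512 KB when most objects survive, which matches the increase described above.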

Summary

One basic idea lies behind garbage collection: in most programming languages, a program appears to have access to unlimited memory, and developers can keep allocating as if by magic.

The basic working principle of the .NET garbage collector is: identify unreachable objects by marking from the roots and clear them; then compact the surviving memory, much like disk defragmentation; and finally use generations to optimize for performance.
