.Net Garbage Collection and Large Object Processing

Original English text: Maoni Stephens; compiled by: Zhao Yukai (@玉开Sir)

The CLR garbage collector treats objects differently depending on their size, and large objects are handled very differently from small ones. Compacting memory is one example: moving large objects around in memory is expensive. Let's look at how the garbage collector handles large objects and what impact large objects can have on program performance.

Large Object Heap and Garbage Collection

In .Net 1.0 and 2.0, an object of 85,000 bytes or more is considered a large object. This number was chosen as a result of performance tuning. When an allocation request reaches this threshold, the object is allocated on the large object heap. What does this mean? To answer that, we first need to understand the .Net garbage collection mechanism.

As most people know, the .Net GC collects objects by "generation". There are three generations: generation 0, generation 1 and generation 2. Generation 0 holds the youngest objects, and generation 2 holds the objects that have lived longest. The GC collects by generation for performance reasons: most objects die in generation 0. For example, in an ASP.NET program, the objects associated with a request should be garbage by the end of that request. Objects that survive a generation 0 collection become generation 1 objects; in other words, generation 1 is a buffer between long-lived objects and objects that are about to die.

From a generational perspective, large objects belong to generation 2, because they are only collected during a generation 2 collection. When a generation is collected, all younger generations are collected as well. For example, a generation 1 collection collects both generation 1 and generation 0, and a generation 2 collection also collects generation 1 and generation 0.

Generations are the garbage collector's logical view of memory. Physically, objects are allocated on different managed heaps. A managed heap is a region of memory that the garbage collector obtains from the operating system (by calling the Windows API VirtualAlloc). When the CLR is loaded, it initializes two managed heaps: the large object heap (LOH) and the small object heap (SOH).

An allocation request is satisfied by placing the managed object on the corresponding managed heap. If the object is smaller than 85,000 bytes, it is placed on the SOH; otherwise, it is placed on the LOH.
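The threshold can be observed from code. The following is a minimal sketch, not part of the original article: the class and method names are made up for illustration, and it relies on GC.GetGeneration reporting generation 2 (GC.MaxGeneration) for objects on the large object heap.

```csharp
using System;

public static class LohThresholdDemo
{
    // Allocates a byte array of the given length and returns the
    // generation the GC reports for it immediately after allocation.
    public static int GenerationOfNewArray(int length)
    {
        byte[] buffer = new byte[length];
        int generation = GC.GetGeneration(buffer);
        GC.KeepAlive(buffer); // keep the array reachable until after the query
        return generation;
    }

    public static void Main()
    {
        // A small array is expected to start out in generation 0.
        Console.WriteLine("byte[1000]  -> generation " + GenerationOfNewArray(1000));
        // 85,000 elements plus the array header is at or above the threshold,
        // so this array is expected to be placed on the LOH.
        Console.WriteLine("byte[85000] -> generation " + GenerationOfNewArray(85000));
    }
}
```

On a typical runtime the small array is reported as generation 0 and the large one as generation 2, although the exact threshold is an implementation detail.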

On the SOH, an object that survives a garbage collection moves to the next generation. That is, an object that survives its first garbage collection is promoted to generation 1, and if it also survives the next garbage collection it becomes a generation 2 object. Generation 2 objects are the oldest and are not promoted any further.
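Promotion can be watched with GC.GetGeneration. The sketch below is illustrative only (the names are hypothetical): it keeps one small object alive across two forced collections and records the generation reported each time. Forcing collections with GC.Collect is done here purely for demonstration.

```csharp
using System;

public static class PromotionDemo
{
    // Returns the generation of one surviving object before any collection,
    // after the first forced collection, and after the second.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor);
        GC.Collect();                         // first collection
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect();                         // second collection
        seen[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor);               // the object stays reachable throughout
        return seen;
    }

    public static void Main()
    {
        int[] seen = ObserveGenerations();
        Console.WriteLine("before any collection:   generation " + seen[0]);
        Console.WriteLine("after first collection:  generation " + seen[1]);
        Console.WriteLine("after second collection: generation " + seen[2]);
    }
}
```

The usual result is the sequence 0, 1, 2, but the runtime is free to promote differently, so code should not depend on it.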

When a garbage collection is triggered, the garbage collector compacts the small object heap, moving the surviving objects together. For the large object heap, because moving memory is so expensive, the CLR team chose to only sweep it: dead objects are put on a free list, which is used to satisfy later large object allocation requests. Adjacent dead objects are merged into a single free block.

Note that as of .Net 4.0 the large object heap is not compacted, but this may change in the future. So if you allocate large objects and need to make sure they are not moved, you should still use the fixed statement.
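The fixed statement requires an unsafe context. As a hedged illustration of the same idea (keeping an object in place while something refers to its address), the sketch below uses GCHandle with GCHandleType.Pinned instead. This is a different mechanism from the one the article names, and the class and method names are made up for this example.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinningDemo
{
    // Pins the array, reads its first byte through the raw address,
    // and always releases the pin afterwards.
    public static byte ReadFirstByteWhilePinned(byte[] buffer)
    {
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            // While the handle is allocated the GC will not move the array,
            // so this address stays valid.
            IntPtr address = handle.AddrOfPinnedObject();
            return Marshal.ReadByte(address, 0);
        }
        finally
        {
            handle.Free(); // unpin; leaving objects pinned hinders the GC
        }
    }

    public static void Main()
    {
        byte[] large = new byte[100000]; // big enough to land on the LOH
        large[0] = 42;
        Console.WriteLine(ReadFirstByteWhilePinned(large));
    }
}
```

Pinning is best kept as short as possible, which is why the handle is freed in a finally block.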

The following is a schematic diagram of the recycling of the small object heap SOH



In the picture above, before the first garbage collection there are four objects, obj0-3. After the first garbage collection, obj1 and obj3 have been collected, and obj2 and obj0 have been moved together. Before the second garbage collection, three more objects, obj4-6, are allocated. After the second garbage collection, obj2 and obj5 have been collected, and obj4 and obj6 have been moved next to obj0.

The following picture is a schematic diagram of large object heap LOH recycling



You can see that before garbage collection there were four objects, obj0-3. After the first generation 2 collection, obj1 and obj2 were collected, and the space they had occupied was merged into one block. When obj4 requested memory, it was allocated from the space freed by obj1 and obj2, leaving a memory fragment behind. If this fragment is smaller than 85,000 bytes, it can never be used again during the lifetime of the program.

If there is not enough free space on the large object heap to satisfy a large object allocation request, the CLR first tries to acquire more memory from the operating system. If that fails, it triggers a generation 2 collection to try to free some memory.

During a generation 2 collection, memory that is no longer needed can be returned to the operating system through VirtualFree. The figure below shows how memory is returned:



When are large objects collected?

Before discussing when large objects are collected, let's look at when garbage collections happen in general. A garbage collection occurs under the following circumstances:

1. An allocation request exceeds the generation 0 threshold or the large object heap threshold. Most garbage collections on the managed heap happen for this reason.

2. The program calls the GC.Collect method. If GC.MaxGeneration is passed to GC.Collect, all generations are collected, including the large object heap.

3. The operating system is low on memory. This happens when the application receives a high memory notification from the operating system.

4. The garbage collection algorithm decides that a generation 2 collection would be productive, in which case it triggers one.

5. A generation's threshold is exceeded. Each generation has a threshold on the amount of space it may occupy. Allocating objects into a generation moves its total memory closer to that threshold, and when an allocation pushes the generation past its threshold, a garbage collection occurs. So allocating small objects consumes the generation 0 threshold and allocating large objects consumes the large object heap threshold. When the garbage collector promotes objects into generation 1 or 2, the thresholds of generations 1 and 2 are consumed. These thresholds change dynamically while the program is running.
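The explicit trigger in item 2 can be sketched as follows. This is an illustrative example with made-up names; it forces a full collection with GC.Collect(GC.MaxGeneration) and uses GC.CollectionCount to show how many collections each generation has seen as a result.

```csharp
using System;

public static class FullCollectionDemo
{
    // Forces one full collection and returns, per generation, how much
    // GC.CollectionCount increased across the call.
    public static int[] CountsAddedByFullCollection()
    {
        int generations = GC.MaxGeneration + 1;
        int[] before = new int[generations];
        int[] delta = new int[generations];

        for (int gen = 0; gen < generations; gen++)
            before[gen] = GC.CollectionCount(gen);

        // Collects generation 2 and every younger generation; the large
        // object heap is collected as part of generation 2.
        GC.Collect(GC.MaxGeneration);

        for (int gen = 0; gen < generations; gen++)
            delta[gen] = GC.CollectionCount(gen) - before[gen];

        return delta;
    }

    public static void Main()
    {
        int[] delta = CountsAddedByFullCollection();
        for (int gen = 0; gen < delta.Length; gen++)
            Console.WriteLine("generation " + gen + ": +" + delta[gen] + " collection(s)");
    }
}
```

Each generation's count goes up by at least one, because collecting generation 2 also collects the younger generations. Calling GC.Collect in production code is rarely a good idea; it is used here only to make the behavior visible.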

Performance impact of large object heap

Let's first look at the cost of allocating large objects. When the CLR hands out memory for a new object, it guarantees that the memory has been cleared and is not in use by any other object. This means that the cost of allocation is dominated by the cost of clearing memory (unless the allocation triggers a garbage collection). If it takes 2 cycles to clear 1 byte, then clearing the smallest large object takes 170,000 cycles. In practice, large objects are often much bigger than this minimum. For example, allocating a 16 MB object on a 2 GHz machine takes about 16 ms just to clear the memory. That is a high price.
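The arithmetic behind these figures can be checked with a short calculation. The sketch below is only a back-of-the-envelope model with hypothetical names; it takes the article's assumption of 2 cycles per byte as given and treats 16 MB as 16 * 1024 * 1024 bytes.

```csharp
using System;

public static class ClearingCost
{
    // Cycles needed to clear a block, for a given cost per byte.
    public static double CyclesToClear(long bytes, double cyclesPerByte)
    {
        return bytes * cyclesPerByte;
    }

    // Time in milliseconds to clear a block on a CPU with the given clock rate.
    public static double MillisecondsToClear(long bytes, double cyclesPerByte, double clockHz)
    {
        return CyclesToClear(bytes, cyclesPerByte) / clockHz * 1000.0;
    }

    public static void Main()
    {
        // Smallest large object: 85,000 bytes * 2 cycles = 170,000 cycles.
        Console.WriteLine(CyclesToClear(85000, 2.0));

        // 16 MB at 2 cycles per byte on a 2 GHz clock: a little under 17 ms.
        long sixteenMegabytes = 16L * 1024 * 1024;
        Console.WriteLine(MillisecondsToClear(sixteenMegabytes, 2.0, 2e9));
    }
}
```

The model ignores caches and memory bandwidth, so real timings will differ; it only shows where the order of magnitude comes from.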

Now let's look at the cost of collection. As mentioned earlier, large objects are collected together with generation 2. If the space occupied by large objects or by generation 2 objects exceeds its threshold, a generation 2 collection is triggered. If the collection is triggered because the large object heap exceeded its threshold, generation 2 itself may not contain many objects that can be collected. This is not a big problem if generation 2 is small. However, if generation 2 is large and holds many objects, excessive generation 2 collections cause performance problems. If you allocate large objects on a temporary basis, a lot of time goes into running garbage collections; that is, continually allocating large objects and then letting them go has a significant negative impact on performance.

Huge objects on the large object heap are usually arrays (it is rare for a single non-array object to be that large). If the array elements contain many references, collection is expensive; if the elements contain no references, the garbage collector does not need to walk through the array at all. For example, suppose you use an array to store the nodes of a binary tree. One way is for each node to hold references to its left and right nodes:

class Node
{
    Data d;
    Node left;   // reference to the left child
    Node right;  // reference to the right child
}

Node[] binaryTree = new Node[num_nodes];

If num_nodes is large, the garbage collector has to look through at least two references per node. An alternative is to store the array indexes of the left and right nodes in each node:


class Node
{
    Data d;
    uint left_index;   // index of the left child in binaryTree
    uint right_index;  // index of the right child in binaryTree
}

This removes the references between the elements; you use binaryTree[left_index] to get to the referenced node, and the garbage collector no longer needs to follow those references during a collection.
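A runnable variant of the index-based layout might look like the sketch below. It makes several assumptions that go beyond the article: the node is declared as a struct so that the array itself contains no references, an int payload stands in for Data, and int indexes with -1 as a "no child" marker replace the uint fields. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class IndexedTreeDemo
{
    public const int None = -1; // marker for "no child"

    // A value type with no reference fields: an array of these gives the
    // garbage collector nothing to trace inside the array.
    public struct Node
    {
        public int Data;       // stand-in for the article's Data field
        public int LeftIndex;  // index of the left child, or None
        public int RightIndex; // index of the right child, or None
    }

    // Builds a complete binary tree where the children of node i
    // are stored at indexes 2i + 1 and 2i + 2.
    public static Node[] BuildCompleteTree(int nodeCount)
    {
        var tree = new Node[nodeCount];
        for (int i = 0; i < nodeCount; i++)
        {
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            tree[i].Data = i;
            tree[i].LeftIndex = left < nodeCount ? left : None;
            tree[i].RightIndex = right < nodeCount ? right : None;
        }
        return tree;
    }

    // Walks the tree from the root by following indexes and sums the payloads.
    public static long SumData(Node[] tree)
    {
        if (tree.Length == 0) return 0;
        long sum = 0;
        var pending = new Stack<int>();
        pending.Push(0);
        while (pending.Count > 0)
        {
            Node node = tree[pending.Pop()];
            sum += node.Data;
            if (node.LeftIndex != None) pending.Push(node.LeftIndex);
            if (node.RightIndex != None) pending.Push(node.RightIndex);
        }
        return sum;
    }

    public static void Main()
    {
        Node[] tree = BuildCompleteTree(7);
        Console.WriteLine(SumData(tree)); // 0 + 1 + ... + 6 = 21
    }
}
```

The trade-off is that the program, not the runtime, is now responsible for keeping the indexes valid.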

Collecting performance data for large object heaps

There are several ways to collect performance data related to large object heaps. Before I explain these methods, let's talk about why you need to collect performance data related to large object heaps.

Before you start collecting performance data for a particular area, you should ideally already have evidence of a performance bottleneck in that area, or you should have examined the other areas you know of without finding a problem that explains what you are seeing.

The .NET CLR Memory performance counters are usually the first tool to reach for when investigating performance problems. The counters related to the LOH are # Gen 2 Collections (the number of generation 2 collections) and Large Object Heap size. # Gen 2 Collections shows the number of generation 2 garbage collections that have occurred since the process started. Large Object Heap size shows the current size of the large object heap, including free space; this counter is updated at the end of each garbage collection, not on every allocation.

The figure below shows the .NET CLR Memory performance data in the Windows performance counters:


You can also query the values of these counters programmatically; many people collect performance counters this way to help find performance bottlenecks.

Of course, you can also use the debugger WinDbg to inspect the large object heap.

Final reminder: so far, the large object heap is not compacted as part of garbage collection, but this is only an implementation detail of the CLR, and program code should not rely on it. If you need to make sure an object is not moved by the garbage collector, use the fixed statement.

Original address: http://www.php.cn/
