In-depth understanding of the Linux kernel: the mapping relationship between virtual address space and physical memory

WBOY
2024-06-03 09:28:44

Memory mapping

Physical memory is also called main memory, and in most computers it is dynamic random access memory (DRAM). Only the kernel can access physical memory directly.

The Linux kernel provides each process with an independent virtual address space, and this address space is contiguous. This lets a process access memory easily, or more precisely, virtual memory. The virtual address space is divided internally into two parts: kernel space and user space.

When a process is in user mode, it can access only user space memory; only after entering kernel mode can it access kernel space memory. Although the address space of every process includes kernel space, all of these kernel spaces map to the same physical memory. So when a process switches to kernel mode, it can easily access kernel space memory.

Not all virtual memory is backed by physical memory; only virtual memory that is actually used gets physical memory allocated. The allocated physical memory is managed through memory mapping. Memory mapping means mapping virtual memory addresses to physical memory addresses. To do this, the kernel maintains a page table for each process that records the mapping between virtual addresses and physical addresses.

Page tables are consulted by the CPU's memory management unit (MMU), so under normal circumstances the processor can find the memory to be accessed directly in hardware. When the virtual address a process accesses cannot be found in the page table, the system raises a page fault, enters kernel space to allocate physical memory, updates the process's page table, and finally returns to user space to resume the process.
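
The page fault path described above can be illustrated with a toy model. This is only a sketch: the class ToyProcess and its fields are hypothetical names, the page table is a plain dictionary, and "allocating physical memory" just hands out the next frame number. A real kernel does far more work.

```python
PAGE_SIZE = 4096  # 4KB pages, the usual size mentioned in this article


class ToyProcess:
    """Toy model of one process with its own page table (hypothetical)."""

    def __init__(self):
        self.page_table = {}      # virtual page number -> physical frame number
        self.next_free_frame = 0  # stand-in for the kernel's frame allocator
        self.page_faults = 0

    def translate(self, vaddr):
        """Translate a virtual address, handling a page fault on a miss."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:
            # Page fault: allocate a frame and update the page table,
            # then continue as if nothing had happened.
            self.page_faults += 1
            self.page_table[vpn] = self.next_free_frame
            self.next_free_frame += 1
        return self.page_table[vpn] * PAGE_SIZE + offset


proc = ToyProcess()
first = proc.translate(0x5000)   # first touch of this page: one page fault
second = proc.translate(0x5008)  # same page again: no new fault
print(hex(first), hex(second), proc.page_faults)
```

The second access hits the entry created by the first one, which is why only one fault is counted.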

The TLB (Translation Lookaside Buffer), which also comes up when discussing CPU context switching, is the MMU's cache of the page table. Because each process has its own independent virtual address space, and the TLB is much faster to access than a page table lookup through the MMU, reducing process context switches and the number of TLB flushes improves TLB cache utilization and thereby the CPU's memory access performance.

The MMU defines the smallest unit of memory mapping, the page, which is usually 4KB in size. As a result, every memory mapping must cover 4KB or an integer multiple of 4KB.
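
As a small illustration of the page granularity, the sketch below rounds a requested length up to a whole number of pages. The function name round_up_to_pages is hypothetical, and the 4KB page size is the usual value rather than a guarantee.

```python
PAGE_SIZE = 4096  # assumed 4KB page size


def round_up_to_pages(length):
    """Return the smallest multiple of PAGE_SIZE that is >= length."""
    return (length + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE


# A 5000-byte mapping still occupies two full pages.
print(round_up_to_pages(5000))
```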

With 4KB pages, the page table as a whole becomes very large. For example, a 32-bit system needs 4GB/4KB = more than 1 million page table entries. To solve the problem of too many page table entries, Linux provides two mechanisms: multi-level page tables and huge pages (HugePage).
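
The arithmetic behind the "more than 1 million entries" figure can be checked directly. The 4-byte entry size used for the table size below is an assumption for illustration, not something stated in this article.

```python
def flat_page_table_entries(address_space_bytes, page_size_bytes):
    """Number of entries in a single-level table covering the whole space."""
    return address_space_bytes // page_size_bytes


GIB = 2**30
KIB = 2**10

entries = flat_page_table_entries(4 * GIB, 4 * KIB)
ASSUMED_ENTRY_BYTES = 4  # assumption: one entry takes 4 bytes
table_bytes = entries * ASSUMED_ENTRY_BYTES

print(entries, table_bytes // 2**20, "MiB per process")
```

Under that assumption, a flat table would cost every process several megabytes even if it used only a few pages.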

A multi-level page table divides memory into blocks and replaces the original mapping with a block index plus an offset within the block. Because only a very small part of the virtual address space is typically in use, a multi-level page table stores only the blocks that are in use, which greatly reduces the number of page table entries. Linux uses a four-level page table to manage memory pages. The virtual address is divided into 5 parts: the first 4 are indexes used to select the page, and the last is the offset within the page.
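
The five-part split can be sketched as bit slicing. The layout below is an assumption: a 48-bit virtual address with four 9-bit indexes and a 12-bit offset, as commonly used on x86-64 with 4KB pages. The level names (pgd, pud, pmd, pte) follow common Linux terminology but are used here only as labels.

```python
OFFSET_BITS = 12                      # 4KB page -> 12-bit offset (assumed)
INDEX_BITS = 9                        # 512 entries per table (assumed)
LEVELS = ("pgd", "pud", "pmd", "pte")  # top level first


def split_virtual_address(vaddr):
    """Split a virtual address into four table indexes and a page offset."""
    parts = {"offset": vaddr & ((1 << OFFSET_BITS) - 1)}
    shift = OFFSET_BITS
    # Walk from the lowest level (pte) up to the highest (pgd).
    for name in reversed(LEVELS):
        parts[name] = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        shift += INDEX_BITS
    return parts


example = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_virtual_address(example))
```

Each index selects one entry in the table of its level, and the offset selects the byte within the final page.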

A huge page is a block of memory larger than a normal page; common sizes are 2MB and 1GB. Huge pages are generally used by processes that use a large amount of memory, such as Oracle and DPDK.
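
To see why huge pages help memory-hungry processes, it is enough to count how many pages are needed to map the same region at each page size. The helper pages_needed is a hypothetical name, and the 1GB region is an arbitrary example.

```python
KIB, MIB, GIB = 2**10, 2**20, 2**30


def pages_needed(region_bytes, page_bytes):
    """Number of pages required to cover a region (ceiling division)."""
    return -(-region_bytes // page_bytes)


region = 1 * GIB
for label, size in (("4KB", 4 * KIB), ("2MB", 2 * MIB), ("1GB", 1 * GIB)):
    print(label, pages_needed(region, size))
```

Fewer pages means fewer page table entries and fewer TLB entries needed to cover the same memory.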

Through these mechanisms, with the page table providing the mapping, a process can access physical memory through virtual addresses.

Virtual memory space layout

Kernel space sits at the top and user space at the bottom; user space is further divided into several segments.

User space memory contains 5 segments, listed here from low addresses to high:

1. Read-only segment, including code and constants, etc.

2. Data segment, including global variables, etc.

3. Heap, including dynamically allocated memory; it starts at a low address and grows upward

4. File mapping segment, including dynamic libraries, shared memory, etc.; it starts at a high address and grows downward

5. Stack, including local variables and function call context, etc. The stack size is fixed, usually 8MB

Of these 5 segments, the heap and the file mapping segment are allocated dynamically. For example, with malloc() or mmap() from the C standard library you can dynamically allocate memory in the heap and the file mapping segment respectively. The memory layout of 64-bit systems is similar, but the address space is much larger.

Memory allocation and reclamation

malloc() is the memory allocation function provided by the C standard library. At the system call level it has two implementations: brk() and mmap().

For small blocks of memory (less than 128K), the C standard library allocates with brk(), that is, by moving the top of the heap. Memory obtained this way is not returned to the system immediately when released; it is cached so that it can be reused.

For large blocks of memory (greater than 128K), it allocates directly with the memory mapping call mmap(), that is, it finds a free region in the file mapping segment and allocates it.
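
The size-based choice between the two system calls can be summarized in a few lines. This is a simplified model, not the allocator's actual code: the function allocation_path is a hypothetical name, and treating a request of exactly 128K as a large block is an assumption.

```python
MMAP_THRESHOLD = 128 * 1024  # the 128K boundary described above


def allocation_path(size):
    """Return which system call a malloc of `size` bytes would rely on."""
    # Assumption: a request of exactly 128K counts as a large block.
    return "mmap" if size >= MMAP_THRESHOLD else "brk"


for request in (64, 100 * 1024, 256 * 1024):
    print(request, allocation_path(request))
```

Real allocators may adjust this boundary at run time, so the fixed constant here is only a starting point for reasoning.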

Each of the two approaches has its advantages and disadvantages:

The caching done by the brk() approach reduces page faults and improves memory access efficiency. However, because this memory is not returned to the system, frequent allocation and release under heavy memory load leads to memory fragmentation.

Memory allocated with mmap() is returned to the system directly when released, so every mmap allocation triggers page faults. Under heavy memory load, frequent allocation causes a large number of page faults and increases the kernel's management burden. This is why malloc uses mmap only for large blocks of memory.

Note that neither call actually allocates memory at the moment it is made. Memory is allocated only when it is accessed for the first time: the access enters the kernel through a page fault, and the kernel then allocates the memory.

In general, Linux uses the buddy system to manage memory allocation. As mentioned above, the MMU manages memory in units of pages. The buddy system also manages memory in units of pages, and it reduces fragmentation (for example, fragmentation caused by the brk approach) by merging adjacent pages.
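
Two calculations sit at the heart of a buddy allocator: picking the block size for a request, and finding the neighbouring block that a freed block can merge with. The sketch below shows both under the usual assumptions that blocks hold a power-of-two number of pages and are aligned to their own size. The function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4KB page size


def order_for(size_bytes):
    """Smallest order such that a block of 2**order pages covers size_bytes."""
    pages = max(1, -(-size_bytes // PAGE_SIZE))  # ceiling division
    return (pages - 1).bit_length()


def buddy_of(page_frame_number, order):
    """Frame number of the buddy block: it differs only in bit `order`."""
    return page_frame_number ^ (1 << order)


print(order_for(3 * PAGE_SIZE))  # 3 pages are served from a 4-page block
print(buddy_of(8, 2))            # the 4-page block at frame 8 pairs with frame 12
```

When a block and its buddy are both free, they can be merged into one block of the next order, which is how adjacent pages are coalesced.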

In a running system, however, there are a large number of objects smaller than a page, for example smaller than 1K. Allocating a separate page for each of them would waste a great deal of memory. So how is memory allocated for them?

In user space, memory that malloc allocates through brk() is not returned to the system immediately when released; it is cached so that it can be used again.

In kernel space, Linux manages small allocations through the slab allocator. You can think of slab as a cache built on top of the buddy system; its main job is to allocate and release small objects in the kernel.

Memory reclamation: if memory is only allocated and never released, the result is a memory leak that can eventually exhaust system memory. Therefore, once an application has finished with memory, it must call free() or munmap() to release it. Of course, the system will not let a single process use up all memory. When memory becomes tight, the system reclaims it through a series of mechanisms, such as the following three:

(1) Reclaim the cache, for example by using the LRU (Least Recently Used) algorithm to reclaim the least recently used memory pages.

(2) Reclaim infrequently accessed memory by writing it to disk through the swap partition (Swap). Swap in effect uses a piece of disk space as memory. It can store data that a process is temporarily not using on disk (this is called swapping out), and when the process accesses that memory again, the data is read back from disk into memory (this is called swapping in). Swap increases the memory available to the system, but it generally comes into play only when memory is insufficient. And because disk reads and writes are much slower than memory, Swap can cause serious memory performance problems.

(3) Kill processes. When memory is tight, the system kills processes that occupy a large amount of memory through OOM (Out of Memory), a protection mechanism of the kernel. OOM monitors the memory usage of processes and uses oom_score to score each process:

The more memory a process consumes, the larger its oom_score;

The more CPU a process takes up, the smaller the oom_score will be.

In this way, the larger a process's oom_score, the more memory it consumes and the more likely it is to be killed by OOM, which better protects the system.

Of course, to suit actual needs, an administrator can manually set a process's oom_adj through the /proc file system, which adjusts the process's oom_score. The range of oom_adj is [-17, 15]. The larger the value, the more likely the process is to be killed by OOM; the smaller the value, the less likely. A value of -17 disables OOM killing for the process. For example, the following command sets the oom_adj of the sshd process to -16, so that sshd is unlikely to be killed by OOM.

echo -16 > /proc/$(pidof sshd)/oom_adj

Buffer and cache

Buffer and cache in the output of the free command both represent caches, but they serve different purposes:

1. Buffer is the memory used by kernel buffers, corresponding to the Buffers value in /proc/meminfo

2. Cache is the memory used by the kernel page cache and slab, corresponding to the sum of Cached and SReclaimable in /proc/meminfo
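
The two definitions above can be reproduced from the text of /proc/meminfo. The sketch below parses a short sample instead of the live file so that it runs anywhere; the numbers in SAMPLE_MEMINFO are made up for illustration, and the helper names are hypothetical.

```python
# Made-up sample in the format of /proc/meminfo (values are illustrative).
SAMPLE_MEMINFO = """\
MemTotal:        8167848 kB
MemFree:         1409696 kB
Buffers:          166236 kB
Cached:          2968312 kB
SReclaimable:     179508 kB
"""


def parse_meminfo(text):
    """Parse 'Name:   value kB' lines into a dict of integers."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            fields[name.strip()] = int(rest.split()[0])
    return fields


def buffer_and_cache_kb(fields):
    """Buffer = Buffers; Cache = Cached + SReclaimable (as defined above)."""
    return fields["Buffers"], fields["Cached"] + fields["SReclaimable"]


buffers_kb, cache_kb = buffer_and_cache_kb(parse_meminfo(SAMPLE_MEMINFO))
print(buffers_kb, cache_kb)
```

On a Linux machine, passing the contents of /proc/meminfo to parse_meminfo in place of the sample should give the figures for that system.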

Simply put, Buffer is a cache of disk data, and Cache is a cache of file data. Both are used for read requests as well as write requests.

Cache is designed from the perspective of the CPU to speed up data exchange between the CPU and memory; examples are the level 1, level 2, and level 3 caches we commonly see. The instructions and data the CPU uses to execute a program all come from memory. Because memory is slow to read and write, a cache is added between the CPU and memory to speed up that exchange. The cache is faster than memory but also more expensive, and because only so many circuits can be integrated into the CPU, it is usually relatively small. Later, to increase speed further, Intel and other companies added level 2 and even level 3 caches. Caches are designed around the principle of program locality: the instructions the CPU executes and the data it accesses are often concentrated in a small region, so once that region is loaded into the cache, the CPU no longer needs to go to memory, which speeds up access. Of course, if the cache does not hold what the CPU needs, memory still has to be accessed.

From the perspective of memory reads and disk reads, cache can be understood as the operating system using spare memory to cache data that may be accessed again, in order to achieve higher read efficiency.

Buffers are designed to speed up data exchange between memory and the hard disk (or other I/O devices). They gather scattered write operations together, which reduces disk fragmentation and repeated disk seeks and thereby improves system performance. Linux has a daemon that periodically flushes the buffer contents (that is, writes them to disk), and the buffer can also be flushed manually with the sync command.

Simply put, a buffer holds data about to be written to disk, and a cache holds data read from disk. Buffers are allocated by various processes and are used for things such as input queues. A simple example: a process needs to read in multiple arrays. Until all of them have been read in completely, the process keeps the arrays read so far in a buffer.

Cache is often used for disk I/O requests. If multiple processes need to access a file, the file is cached to speed up the next access, which improves system performance.
