Stop saying you don’t understand Linux memory management, 10 pictures will make it clear for you!

WBOY (forwarded)
2024-02-10 23:48:03, 667 views

Today we will study the memory management of Linux.

For business developers who spend their days writing CRUD code, memory management can seem far removed from daily work. Although this topic is not a popular one (many people will probably never use it directly after learning it), it is truly the foundation of the basics.

This is like the internal strength training in martial arts novels. You won’t see immediate results after learning it, but it will be of great benefit to your future development work because you will stand taller.

All the sample pictures in the article are drawn by me. Drawing pictures takes more time than coding, but everyone understands more intuitively by looking at pictures than words, so I still draw them. For students who need high-definition sample pictures, there are ways to obtain them at the end of the article.

To be more utilitarian about it: if you let slip in an interview that you know this material, and can explain it point by point, the interviewer may become more interested in you. That puts you one step closer to a promotion, a raise, or a job offer, and to the pinnacle of life.

Assumption: the technical content in this article assumes a 32-bit Linux system on the x86 architecture.

Virtual address

Even in modern operating systems, memory is still a very precious resource in the computer. Just look at how many terabytes of solid-state drive your computer has, and then look at the memory size.

To make full use of system memory and manage it well, Linux uses virtual memory management. With virtual memory, each process gets its own 4GB virtual address space, and these spaces do not interfere with one another.

A process is initialized, allocates memory, and operates entirely in terms of these virtual addresses. Only when the process actually accesses memory is the mapping between the virtual address and a physical address established and a physical memory page brought in.

To use a loose analogy, this works just like today's XX network disk services. If your network disk offers 1TB, do you really think all of that space is handed to you at once? That would be naive. Space is allocated only when you put something in, and you are given exactly as much real space as you use. Yet it looks as though you and your friend each have 1TB.
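The lazy mapping described above can be sketched as a toy model. This is not kernel code: the class `LazyAddressSpace`, its methods, the frame counter, and the 4 KB page size are all assumptions made for illustration. It only shows the idea that reserving virtual addresses costs no physical memory until a page is first touched.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages, typical for 32-bit x86


class LazyAddressSpace:
    """Toy model: virtual ranges are reserved first, frames are assigned on first access."""

    def __init__(self):
        self.reserved = []      # (start, end) virtual ranges handed to the process
        self.page_table = {}    # virtual page number -> physical frame number
        self.next_frame = 0     # next free physical frame (hypothetical allocator)
        self.page_faults = 0

    def reserve(self, start, length):
        """Hand out virtual addresses only; no physical memory is used yet."""
        self.reserved.append((start, start + length))

    def access(self, vaddr):
        """Return the physical address, mapping a frame on the first touch."""
        if not any(lo <= vaddr < hi for lo, hi in self.reserved):
            raise MemoryError(f"address {vaddr:#x} was never reserved")
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.page_table:          # page fault: no mapping yet
            self.page_faults += 1
            self.page_table[vpn] = self.next_frame
            self.next_frame += 1
        return self.page_table[vpn] * PAGE_SIZE + offset


space = LazyAddressSpace()
space.reserve(0x08000000, 1024 * PAGE_SIZE)    # 4 MB of virtual addresses
print(len(space.page_table))                   # frames mapped right after reserving
space.access(0x08000010)                       # first touch maps one frame
space.access(0x08000020)                       # same page, no new fault
print(len(space.page_table), space.page_faults)
```

In this model the 4 MB reservation is the "1TB network disk", and the single mapped frame is the space you actually used.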

Benefits of virtual addresses

  • Prevent users from directly accessing physical memory addresses, prevent destructive operations, and protect the operating system.
  • Each process is allocated 4GB of virtual memory, allowing user programs to use a larger address space than the actual physical memory.

The 4GB process virtual address space is divided into two parts: "user space" and "kernel space".

Figure: User space and kernel space

Physical address

From the previous chapter we know that both user space and kernel space use virtual addresses. When a process actually accesses memory, the kernel's demand paging mechanism raises a page fault exception and brings in a physical memory page.

Translating a virtual address into a physical memory address involves the MMU (Memory Management Unit), which performs segmented and paged (segment-page) address translation on the virtual address. The details of segmentation and paging are not described here; you can refer to any computer organization textbook.

Figure: Segment-page memory management address translation
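The article skips the details of paging, so here is only a minimal sketch of the paging half of the translation. It assumes the classic two-level scheme of 32-bit x86 without PAE (10 bits of page directory index, 10 bits of page table index, 12 bits of offset) and a flat segment, so the linear address equals the virtual address. The helper names and the dict-based toy table are made up for illustration.

```python
PAGE_SHIFT = 12       # 4 KB pages: the low 12 bits are the offset inside a page
PGDIR_SHIFT = 22      # each page directory entry covers 4 MB
ENTRIES = 1024        # entries per page directory and per page table


def split_linear_address(addr):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    pgd_index = (addr >> PGDIR_SHIFT) & (ENTRIES - 1)
    pte_index = (addr >> PAGE_SHIFT) & (ENTRIES - 1)
    offset = addr & ((1 << PAGE_SHIFT) - 1)
    return pgd_index, pte_index, offset


def translate(addr, page_directory):
    """Walk a toy two-level table (dict of dicts holding frame numbers)."""
    pgd_index, pte_index, offset = split_linear_address(addr)
    page_table = page_directory.get(pgd_index)
    if page_table is None or pte_index not in page_table:
        return None                      # no mapping: the MMU would raise a page fault
    return (page_table[pte_index] << PAGE_SHIFT) | offset


# Hypothetical mapping: directory slot 768, table slot 1 -> physical frame 1.
toy_directory = {768: {1: 0x1}}
print(split_linear_address(0xC0001234))
print(hex(translate(0xC0001234, toy_directory)))
```

A lookup that returns `None` here corresponds to the page fault case described earlier.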

The Linux kernel divides physical memory into three management zones:

ZONE_DMA

DMA memory zone. Contains the memory page frames between 0MB and 16MB. Older ISA-based devices can use it for DMA, and it is mapped directly into the kernel's address space.

ZONE_NORMAL

Normal memory zone. Contains the regular memory page frames between 16MB and 896MB, which are mapped directly into the kernel's address space.

ZONE_HIGHMEM

High memory zone. Contains the memory page frames above 896MB, which are not directly mapped. These page frames are accessed through permanent mappings and temporary mappings.

Figure: Physical memory zone division
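The three zone boundaries can be written down as a small classifier. This is only a sketch of the 16MB and 896MB limits listed above for 32-bit x86; the function name `zone_of` is made up, and the real kernel tracks zones per page frame rather than per byte address.

```python
MB = 1024 * 1024
ZONE_DMA_END = 16 * MB       # ZONE_DMA covers 0MB up to 16MB
ZONE_NORMAL_END = 896 * MB   # ZONE_NORMAL covers 16MB up to 896MB


def zone_of(phys_addr):
    """Return the name of the zone that a physical address falls into."""
    if phys_addr < 0:
        raise ValueError("physical addresses are non-negative")
    if phys_addr < ZONE_DMA_END:
        return "ZONE_DMA"
    if phys_addr < ZONE_NORMAL_END:
        return "ZONE_NORMAL"
    return "ZONE_HIGHMEM"


for addr in (0x00100000, 64 * MB, 1024 * MB):
    print(hex(addr), zone_of(addr))
```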

User Space

User space is the part that user processes can access. Each process has its own independent user space, with virtual addresses ranging from 0x00000000 to 0xBFFFFFFF, a total of 3GB.

User processes can normally access only virtual addresses in user space. They reach kernel space only when they trap into the kernel, for example through a system call.
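The 3GB/1GB split is easy to check with a few lines of arithmetic. In this sketch the constant name `TASK_SIZE` is used for the first address above user space; treat the name and the helper `is_user_address` as illustrative assumptions for the default split described in this article.

```python
GB = 1 << 30
ADDRESS_SPACE = 1 << 32          # 32-bit virtual address space, 4GB
TASK_SIZE = 0xC0000000           # first address above user space (assumed default split)


def is_user_address(vaddr):
    """True if a 32-bit virtual address lies in user space."""
    if not 0 <= vaddr < ADDRESS_SPACE:
        raise ValueError("not a 32-bit address")
    return vaddr < TASK_SIZE


user_gb = TASK_SIZE // GB
kernel_gb = (ADDRESS_SPACE - TASK_SIZE) // GB
print(user_gb, kernel_gb)
print(is_user_address(0xBFFFFFFF), is_user_address(0xC0000000))
```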

Process and Memory

The user space occupied by a process (a running program) is divided into five memory areas according to the principle that "address ranges with the same access attributes are stored together". Access attributes mean readable, writable, executable, and so on.

  • Code segment

    The code segment holds the machine instructions of the executable file; it is the in-memory image of the executable program. It must be protected from illegal modification at run time, so it is read-only and not writable.

  • Data segment

    The data segment stores the initialized global variables of the executable file, in other words, the program's statically allocated variables and global variables.

  • BSS segment

    The BSS segment contains the program's uninitialized global variables. In memory, the entire BSS segment is set to zero.

  • Heap

    The heap stores the memory that is dynamically allocated while the process runs. Its size is not fixed and it can grow or shrink dynamically. When a process calls a function such as malloc to allocate memory, the newly allocated memory is added to the heap (the heap grows); when a function such as free releases memory, the freed memory is removed from the heap (the heap shrinks).

  • Stack

    The stack stores the local variables that the program creates temporarily, that is, the variables defined inside functions (excluding variables declared static, since static places a variable in the data segment). In addition, when a function is called, its parameters are pushed onto the stack of the calling process, and when the call completes, the function's return value is also stored back on the stack. Because the stack is first-in, last-out, it is particularly convenient for saving and restoring the call context. In this sense, we can think of the stack as a memory area for storing and exchanging temporary data.

Among the memory areas above, the data segment, the BSS segment, and the heap are usually stored contiguously in memory, while the code segment and the stack are usually stored independently. On the i386 architecture the stack grows downward and the heap grows upward, so they grow toward each other.

You can also use the size command under Linux to check the size of each memory area of the compiled program:

[lemon ~]# size /usr/local/sbin/sshd
   text    data     bss     dec     hex filename
1924532   12412  426896 2363840  2411c0 /usr/local/sbin/sshd
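The columns of the size output are related: dec is the sum of text, data, and bss, and hex is the same total in hexadecimal. The short sketch below checks this using the numbers from the sshd example above (text 1924532, data 12412, bss 426896); the parsing helper `parse_size_output` is a made-up convenience, not part of any tool.

```python
SIZE_OUTPUT = """\
   text    data     bss     dec     hex filename
1924532   12412  426896 2363840  2411c0 /usr/local/sbin/sshd
"""


def parse_size_output(output):
    """Parse the two-line output of `size` (header row plus one data row)."""
    header, row = (line.split() for line in output.strip().splitlines())
    fields = dict(zip(header, row))
    return {
        "text": int(fields["text"]),
        "data": int(fields["data"]),
        "bss": int(fields["bss"]),
        "dec": int(fields["dec"]),
        "hex": int(fields["hex"], 16),   # same total, written in hexadecimal
        "filename": fields["filename"],
    }


sizes = parse_size_output(SIZE_OUTPUT)
total = sizes["text"] + sizes["data"] + sizes["bss"]
print(total, total == sizes["dec"] == sizes["hex"])
```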

Kernel space

On a 32-bit x86 system, the Linux kernel address space is the high part of the virtual address space, from 0xC0000000 to 0xFFFFFFFF, 1GB in total. It contains the kernel image, the physical page tables, drivers, and everything else that runs in kernel space.

Figure: Subdivision of kernel space

Direct mapping area

Direct Memory Region: the first 896MB of kernel space, counted from the start address of kernel space, is the direct memory mapping area.

The 896MB of linear addresses in the direct mapping area map directly onto the first 896MB of physical addresses, so the linear addresses and the physical addresses they map to are both contiguous. For example, the linear address 0xC0000001 in the kernel address space corresponds to the physical address 0x00000001; the two differ by the offset PAGE_OFFSET = 0xC0000000.

In this area the linear address and the physical address are related by a linear conversion: linear address = PAGE_OFFSET + physical address. You can also use the virt_to_phys() function to convert a linear address in kernel virtual space into a physical address.
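The linear conversion can be modeled in a few lines. Note that these are plain Python functions that only mimic the arithmetic for the direct mapping area; they borrow the names virt_to_phys and phys_to_virt for readability and are not the kernel's implementation. The range checks are an addition of this sketch.

```python
PAGE_OFFSET = 0xC0000000
DIRECT_MAP_SIZE = 896 * 1024 * 1024   # size of the direct mapping area


def virt_to_phys(vaddr):
    """Model of the conversion, valid only inside the direct mapping area."""
    if not PAGE_OFFSET <= vaddr < PAGE_OFFSET + DIRECT_MAP_SIZE:
        raise ValueError("address is outside the direct mapping area")
    return vaddr - PAGE_OFFSET


def phys_to_virt(paddr):
    """Inverse conversion: linear address = PAGE_OFFSET + physical address."""
    if not 0 <= paddr < DIRECT_MAP_SIZE:
        raise ValueError("physical address is not directly mapped")
    return paddr + PAGE_OFFSET


print(hex(virt_to_phys(0xC0000001)))
print(hex(phys_to_virt(0x00000001)))
```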

High memory linear address space

The kernel-space linear addresses from 896MB to 1GB, a 128MB range, form the high memory linear address space. Why is it called that? Let me explain:

As mentioned above, kernel space is 1GB in total, and the first 896MB of linear addresses, counted from the start of kernel space, map directly onto 896MB of physical addresses.

Taking a step back, even if all 1GB of kernel-space linear addresses were mapped directly to physical addresses, the kernel could address at most 1GB of physical memory.

How big are the memory sticks you have now? Wake up, it's almost 2023, and most PCs have more than 1GB of memory!

So the kernel sets aside the last 128MB of its address range and divides it into the following three high memory mapping areas, through which it can address the entire physical address range. This problem does not exist on 64-bit systems, because the available linear address space is far larger than the memory that can be installed.
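The numbers in this argument can be verified directly. The sketch below computes where the last 128MB of kernel space begins and how much RAM cannot be directly mapped; the constant names `HIGH_WINDOW_START` and `HIGH_WINDOW_SIZE`, the helper `split_ram`, and the 2GB machine are hypothetical, chosen only for illustration.

```python
MB = 1024 * 1024
PAGE_OFFSET = 0xC0000000            # start of kernel space
KERNEL_SPACE_END = 0x100000000      # one past 0xFFFFFFFF
DIRECT_MAP_SIZE = 896 * MB

HIGH_WINDOW_START = PAGE_OFFSET + DIRECT_MAP_SIZE
HIGH_WINDOW_SIZE = KERNEL_SPACE_END - HIGH_WINDOW_START


def split_ram(ram_bytes):
    """Return (directly mapped bytes, high memory bytes) for a given RAM size."""
    direct = min(ram_bytes, DIRECT_MAP_SIZE)
    return direct, ram_bytes - direct


print(hex(HIGH_WINDOW_START), HIGH_WINDOW_SIZE // MB)
direct, high = split_ram(2048 * MB)     # hypothetical machine with 2GB of RAM
print(direct // MB, high // MB)
```

On such a machine more than half of the RAM has to be reached through the small 128MB window, which is why the mappings in that window are created and torn down as needed.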

Dynamic memory mapping area

vmalloc Region: this region is allocated by the kernel function vmalloc. Its defining feature is that the linear address space is contiguous while the corresponding physical address space is not necessarily contiguous. The physical pages behind a linear address range allocated by vmalloc may be in low memory or in high memory.

Permanent memory mapping area

Persistent Kernel Mapping Region: this region provides access to high memory. To use it, allocate a high memory page with alloc_page(__GFP_HIGHMEM), or use the kmap function to map allocated high memory into this region.

Fixed mapping area

Fixed Kernel Mapping Region: only a 4KB isolation gap separates this region from the top of the 4GB address space. Each of its address entries serves a specific purpose, such as ACPI_BASE.

Figure: Kernel space physical memory mapping

Review

We have covered a lot above, so don't rush into the next section; let's first review what has been said. I have drawn another picture here. If you read the previous chapters carefully, you should now have a global picture of memory management like this one in your mind.

Figure: Full picture of kernel space and user space

Memory data structure

For the kernel to manage the system's virtual memory, memory management data structures have to be abstracted from it. Memory management operations such as allocation and release are all built on these data structures. Here are two data structures that manage virtual memory areas.

User space memory data structure

In the earlier chapter "Process and Memory" we said that a Linux process is divided into five memory areas: the code segment, data segment, BSS, heap, and stack. The kernel manages these areas by abstracting each of them into a vm_area_struct memory management object.

vm_area_struct is the basic management unit that describes the process address space. A process usually needs several vm_area_struct objects to describe its user-space virtual addresses, and the kernel organizes them with both a linked list and a red-black tree.

The linked list is used when all nodes need to be traversed, while the red-black tree is suitable for locating a specific memory area in the address space. The kernel uses both data structures in order to achieve high performance for various operations on memory areas.

Address management model of user space process:

Figure: vm_area_struct
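To make the two access patterns concrete, here is a small model. It is a sketch under clear assumptions: a sorted Python list stands in for the linked list (full traversal), and a binary search over the start addresses stands in for the red-black tree (fast lookup by address). The class names, the `find_area` method, and the example addresses are all hypothetical; only the field names vm_start and vm_end are borrowed from vm_area_struct.

```python
import bisect
from dataclasses import dataclass


@dataclass
class VMArea:
    vm_start: int   # first address of the area
    vm_end: int     # first address after the area
    perms: str      # access attributes, e.g. "r-x"
    name: str


class AddressSpace:
    def __init__(self):
        self.areas = []    # sorted by vm_start; walking it models the linked list
        self.starts = []   # parallel list of start addresses for binary search

    def insert(self, area):
        i = bisect.bisect_left(self.starts, area.vm_start)
        self.starts.insert(i, area.vm_start)
        self.areas.insert(i, area)

    def find_area(self, addr):
        """Return the area containing addr, or None (stand-in for the tree lookup)."""
        i = bisect.bisect_right(self.starts, addr) - 1
        if i >= 0 and addr < self.areas[i].vm_end:
            return self.areas[i]
        return None


mm = AddressSpace()
# Hypothetical layout, inserted out of order on purpose.
mm.insert(VMArea(0xBFFDF000, 0xC0000000, "rw-", "stack"))
mm.insert(VMArea(0x08048000, 0x08050000, "r-x", "text"))
mm.insert(VMArea(0x08050000, 0x08052000, "rw-", "data"))

print([a.name for a in mm.areas])          # traversal in address order
print(mm.find_area(0x08048100).name)       # lookup by address
print(mm.find_area(0x10000000))            # address in an unmapped hole
```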

Data structure for dynamically allocated kernel space memory

In the kernel space chapter, we mentioned the "dynamic memory mapping area". This area is allocated by the kernel function vmalloc. Its defining feature is that the linear address space is contiguous while the corresponding physical address space is not necessarily contiguous. The physical pages behind a linear address range allocated by vmalloc may be in low memory or in high memory.

Addresses allocated by vmalloc are confined to the range between VMALLOC_START and VMALLOC_END. Each block of kernel virtual memory allocated by vmalloc corresponds to one vm_struct structure, and a 4KB free guard interval separates adjacent kernel virtual address blocks to catch out-of-bounds access.

As with user-space virtual addresses, these virtual addresses have no simple mapping to physical memory. They must be translated to physical addresses or physical pages through the kernel page tables, and they may not be mapped yet; physical pages are actually allocated only when a page fault occurs.

Figure: Dynamic memory mapping
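The two properties described here, contiguous virtual addresses backed by scattered physical pages and a 4KB gap between neighboring blocks, can be shown with a toy allocator. Everything in this sketch is an assumption for illustration: the class `ToyVmalloc`, the numeric values given to VMALLOC_START and VMALLOC_END, and the frame numbers. For simplicity the model assigns frames at allocation time, and each tuple in `areas` stands in for one vm_struct.

```python
PAGE_SIZE = 4096
VMALLOC_START = 0xF8800000    # hypothetical value for illustration
VMALLOC_END = 0xFF000000      # hypothetical value for illustration


class ToyVmalloc:
    def __init__(self, free_frames):
        self.free_frames = list(free_frames)   # physical frame numbers, any order
        self.next_vaddr = VMALLOC_START
        self.areas = []                        # (start address, size, frames) per block

    def vmalloc(self, size):
        """Reserve a contiguous virtual range backed by whatever frames are free."""
        pages = -(-size // PAGE_SIZE)                  # round up to whole pages
        span = (pages + 1) * PAGE_SIZE                 # one extra 4KB guard page
        if self.next_vaddr + span > VMALLOC_END or pages > len(self.free_frames):
            return None
        frames = [self.free_frames.pop() for _ in range(pages)]
        addr = self.next_vaddr
        self.areas.append((addr, pages * PAGE_SIZE, frames))
        self.next_vaddr += span
        return addr


# Frame 250000 lies above 896MB (high memory); frames 3, 8 and 12 lie in low memory.
allocator = ToyVmalloc([3, 70000, 12, 250000, 8])
a = allocator.vmalloc(2 * PAGE_SIZE)
b = allocator.vmalloc(100)
print(hex(a), hex(b), (b - a) // PAGE_SIZE)
print(allocator.areas[0][2])                   # frames behind the first block
```

The first block is two virtually adjacent pages, yet its frames are far apart in physical memory, one in low memory and one in high memory.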

Conclusion

Linux memory management is a very complex system, and what this article describes is only the tip of the iceberg. It shows you the overall picture of memory management from a macro perspective. Generally speaking, this knowledge is enough when you chat with an interviewer, but I hope everyone will go on to understand the deeper principles through further reading.

This article can be used as an index-like study guide. When you want to study a certain point in depth, you can find the entry point in these chapters and the position of this knowledge point in the macroscopic view of memory management.

I also drew many example diagrams while writing this article, and they can serve as a knowledge index. Personally, I find pictures clearer than text. To get the high-resolution originals, reply "Memory Management" in the backend of my official account "Backend Technology School".

As usual, thank you for reading. The purpose of this article is to share my understanding of the topic. I verify technical articles repeatedly to make them as accurate as possible, but if you find obvious flaws, you are welcome to point them out so that we can discuss and learn together. That's it for today's technical sharing. See you in the next issue.

Statement: This article is reproduced from lxlinux.net. If there is any infringement, please contact admin@php.cn to have it deleted.