


Not long ago, our team held a sharing session that I had been looking forward to: "Linux Virtual Memory". The topic came up one night when we were working overtime and got to discussing virtual memory. Our team lead found that several colleagues had an unclear understanding of it, so he specifically assigned the topic to the colleague who gave the talk (laughing).
I had learned some operating system concepts before, mainly because after graduation I regretted wasting my four years in college. I felt I had not done justice to my computer science background, so after work I took the Harbin Institute of Technology operating systems open course on NetEase Cloud Classroom. I also read "Linux Kernel Design and Implementation", a book that covers the operating system fairly concisely. When I wrote a simple server in C last year, I learned more about the low-level parts of the system. Thanks to this knowledge, I feel more in control at the application layer, and it also helped me the last time I was troubleshooting a problem.
A few days ago, a colleague asked me another question about virtual memory. I found that my understanding was not deep enough and that some of my ideas even contradicted each other. So I went through the material again and reorganized what I know, hoping to apply it more smoothly next time.
The origin of virtual memory
Virtual memory is undoubtedly one of the most important concepts in an operating system, and I think this is mainly because of the important "strategic position" of memory. The CPU is extremely fast, but its storage capacity is small and its function is narrow. Other I/O hardware supports all kinds of fancy functions, but it is far too slow compared with the CPU. So a lubricant is needed between them as a buffer, and this is where memory comes into play.
In modern operating systems, multitasking is standard. Running multiple tasks concurrently greatly improves CPU utilization, but it also leads to conflicts when multiple processes operate on memory. Virtual memory was proposed to solve this problem.

The above picture is the simplest and most intuitive explanation of virtual memory.
The operating system has one piece of physical memory (the middle part) and two processes, P1 and P2 (in practice there are more). The operating system quietly tells P1 and P2 separately: "All of my memory is yours, use as much as you like." In reality this is an empty promise. The memory is nominally given to P1 and P2, but all they actually receive is a range of addresses. Only when P1 and P2 really start to use the memory does the system begin to move things around and piece together physical blocks for them. P2 thinks it is using memory A, but the system has quietly redirected it to the real memory B. P1 and P2 may even share memory C without knowing it.
This way of deceiving processes is virtual memory. Processes such as P1 and P2 each believe they own the entire memory, and they neither know nor need to care which physical addresses they actually use.
Paging and page tables
Virtual memory is an operating system concept. To the operating system, virtual memory is a lookup table: when P1 reads data in memory A, the table says it should go to address A of physical memory, and when it looks for data in memory B, the table says it should go to address C of physical memory.
We know that the basic unit of memory is the byte. If every byte of virtual memory were mapped to a physical address individually, each entry would need at least 8 bytes (a 32-bit virtual address -> a 32-bit physical address). With 4 GB of memory, the lookup table would need 32 GB, which is more than the physical memory itself. So the operating system introduces the concept of a page.
When the system starts, the operating system divides all physical memory into 4 KB pages, and from then on memory is allocated in units of pages. This greatly shrinks the table that maps virtual pages to physical pages: 4 GB of memory needs only an 8 MB mapping table. Virtual memory that a process never uses needs no mapping entries at all, and Linux also uses multi-level page tables for large memory, which reduces memory consumption further. The table that maps virtual memory to physical memory is called the page table.
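The two table sizes quoted above can be checked with a little arithmetic. The sketch below assumes a flat, single-level table with 8-byte entries, as in the text; the function and constant names are made up for illustration and are not taken from any kernel source.

```python
ADDRESS_BITS = 32        # 32-bit virtual address space, as in the text
ENTRY_BYTES = 8          # 4-byte virtual address + 4-byte physical address
PAGE_SIZE = 4 * 1024     # 4 KB pages


def mapping_table_bytes(unit_bytes: int) -> int:
    """Size of a flat table with one entry per `unit_bytes` of address space."""
    entries = 2 ** ADDRESS_BITS // unit_bytes
    return entries * ENTRY_BYTES


per_byte = mapping_table_bytes(1)          # one entry for every byte
per_page = mapping_table_bytes(PAGE_SIZE)  # one entry for every 4 KB page

print(per_byte // 2 ** 30, "GiB")  # 32 GiB
print(per_page // 2 ** 20, "MiB")  # 8 MiB
```

Mapping whole pages instead of single bytes shrinks the table by a factor equal to the page size, here 4096.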
Memory addressing and allocation
We know that through the virtual memory mechanism, each process thinks it owns all of the memory. When the process accesses memory, the virtual address it supplies is translated into a physical address, and the data is then fetched from that physical address. There is dedicated hardware in the CPU, the memory management unit (MMU), which translates virtual addresses. The CPU also caches recent page table lookups (the translation lookaside buffer, TLB). Because programs exhibit locality, the hit rate of this cache can reach 98%.
The case above assumes that the page table already maps the virtual address to physical memory. If the page the process accesses has not yet been given physical memory, a page fault interrupt is raised. While handling the interrupt, the system switches to kernel mode and assigns a physical page to the process's virtual address.
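To make the translation step concrete, here is a toy model in Python. It is only a sketch: `ToyMMU`, its dictionary-based page table, and the unbounded lookup cache are hypothetical simplifications, and a real MMU does this in hardware with a fixed-size cache.

```python
PAGE_SIZE = 4096
OFFSET_BITS = 12  # log2(PAGE_SIZE)


class ToyMMU:
    """Toy address translation: page table lookup plus a small lookup cache."""

    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> physical frame number
        self.cache = {}               # recently used translations
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> OFFSET_BITS          # high bits select the page
        offset = vaddr & (PAGE_SIZE - 1)    # low 12 bits are kept unchanged
        if vpn in self.cache:
            self.hits += 1
        else:
            self.misses += 1
            self.cache[vpn] = self.page_table[vpn]  # "walk" the page table
        return (self.cache[vpn] << OFFSET_BITS) | offset


mmu = ToyMMU({0: 7, 1: 3})
paddr = mmu.translate(0x1ABC)   # page 1, offset 0xABC -> frame 3
print(hex(paddr))               # 0x3abc

# Sequential accesses stay on one page, so almost all lookups hit the cache.
for addr in range(0, PAGE_SIZE, 4):
    mmu.translate(addr)
print(mmu.hits, mmu.misses)     # 1023 2
```

The loop touches 1024 addresses on the same page and misses only once, which is the locality effect the paragraph above relies on.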
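The page fault path can also be sketched as a small simulation. `ToyKernel` and its methods are invented names, and the model ignores everything a real kernel does beyond handing out the next free frame on first access.

```python
PAGE_SIZE = 4096


class ToyKernel:
    """Toy demand paging: a frame is assigned only when a page is first touched."""

    def __init__(self, total_frames):
        self.free_frames = list(range(total_frames))
        self.page_tables = {}   # pid -> {virtual page number: frame number}
        self.fault_count = 0

    def access(self, pid, vaddr):
        table = self.page_tables.setdefault(pid, {})
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in table:             # no mapping yet: page fault
            self.fault_count += 1
            self.handle_page_fault(table, vpn)
        return table[vpn] * PAGE_SIZE + offset

    def handle_page_fault(self, table, vpn):
        if not self.free_frames:
            raise MemoryError("no free physical frames")
        table[vpn] = self.free_frames.pop(0)


kernel = ToyKernel(total_frames=4)
first = kernel.access(1, 0x5010)    # fault: process 1 gets frame 0
second = kernel.access(1, 0x5020)   # same page, no fault
third = kernel.access(2, 0x5010)    # same virtual address, other process: frame 1
print(hex(first), hex(second), hex(third), kernel.fault_count)
```

Both processes use the virtual address 0x5010, yet they end up in different physical frames, and each frame is only assigned at the moment of first use.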
Function
Virtual memory not only solves the problem of memory access conflicts between multiple processes through memory address translation, but also brings more benefits.
Process Memory Management
It helps the process manage memory, mainly in two ways:
Memory integrity: Because virtual memory "deceives" the process, each process thinks the memory it obtains is one contiguous range of addresses. When writing an application, we do not need to worry about finding large contiguous blocks; we can always assume the system has a large enough block of memory.
Security: Because a process must go through the page table whenever it accesses memory, the operating system can implement memory permission control by adding access permission flags to each page table entry.
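The permission check described in the security point can be illustrated as follows. This is a sketch with hypothetical names (`Perm`, `check_access`) and a made-up two-entry page table; real page table entries encode such flags as bits next to the frame number.

```python
from enum import Flag, auto


class Perm(Flag):
    READ = auto()
    WRITE = auto()
    EXEC = auto()


# virtual page number -> (physical frame number, allowed access)
page_table = {
    0: (10, Perm.READ | Perm.EXEC),   # e.g. program code
    1: (11, Perm.READ | Perm.WRITE),  # e.g. program data
}


def check_access(vpn, wanted):
    """Return the frame if the access is allowed, otherwise raise PermissionError."""
    frame, perms = page_table[vpn]
    if wanted not in perms:
        raise PermissionError(f"page {vpn} does not allow {wanted}")
    return frame


data_frame = check_access(1, Perm.WRITE)   # allowed
try:
    check_access(0, Perm.WRITE)            # writing to a code page is refused
    write_to_code_allowed = True
except PermissionError:
    write_to_code_allowed = False
print(data_frame, write_to_code_allowed)   # 11 False
```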
Data sharing
It is easier to share memory and data through virtual memory.
When a process loads a system library, it first allocates a piece of memory and then loads the library file from disk into it. If physical memory were used directly, physical addresses are unique, so even if the system noticed that the same library had been loaded twice, each process would have specified a different load address and the system could do nothing about it.
When using virtual memory, the system only needs to point the virtual memory address of the process to the physical memory address where the library file is located. As shown in the figure above, the B addresses of processes P1 and P2 both point to physical address C.
It is also very simple to use shared memory by using virtual memory. The system only needs to point the virtual memory address of each process to the shared memory address allocated by the system.
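The idea of two virtual ranges backed by the same physical pages can be observed with Python's standard `mmap` module. The sketch below maps the same temporary file twice inside a single process, which stands in for two processes mapping a shared region; the variable names are made up, and the default shared mapping mode of `mmap.mmap` is assumed.

```python
import mmap
import tempfile

PAGE = mmap.PAGESIZE

with tempfile.TemporaryFile() as f:
    f.truncate(PAGE)                       # the backing file must be at least one page
    view_p1 = mmap.mmap(f.fileno(), PAGE)  # first mapping (think: process P1)
    view_p2 = mmap.mmap(f.fileno(), PAGE)  # second mapping (think: process P2)

    view_p1[0:5] = b"hello"                # write through the first mapping
    seen_by_p2 = bytes(view_p2[0:5])       # read through the second mapping

    view_p1.close()
    view_p2.close()

print(seen_by_p2)
```

The two mappings are separate ranges of virtual addresses, yet a write through one is visible through the other because both refer to the same underlying pages.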
SWAP
Virtual memory allows the process to "expand" memory.
We mentioned earlier that virtual memory allocates physical memory to the process through page fault interrupts. Memory is always limited. What if all physical memory is occupied?
Linux provides swap for this case, and a swap partition can be configured in Linux. When physical memory has been handed out and the available memory is insufficient, data in memory that is temporarily unused is first written to disk so that the processes that need memory can use it. When the owning process needs that data again, it is loaded back into memory. Through this "swapping" technique, Linux lets processes use more memory than is physically available.
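A toy model of swapping is sketched below. It assumes a least-recently-used eviction policy purely for illustration; `ToySwap` is a hypothetical class, and the Linux kernel's actual page reclaim logic is considerably more involved.

```python
from collections import OrderedDict


class ToySwap:
    """Toy swap: when RAM is full, the least recently used page goes to 'disk'."""

    def __init__(self, ram_frames):
        self.ram_frames = ram_frames
        self.ram = OrderedDict()   # page -> data, least recently used first
        self.disk = {}             # swapped-out pages
        self.swap_outs = 0
        self.swap_ins = 0

    def touch(self, page, data=None):
        if page in self.ram:
            self.ram.move_to_end(page)             # mark as recently used
        else:
            value = None
            if page in self.disk:                  # swap the page back in
                value = self.disk.pop(page)
                self.swap_ins += 1
            if len(self.ram) >= self.ram_frames:   # RAM full: swap a victim out
                victim, victim_data = self.ram.popitem(last=False)
                self.disk[victim] = victim_data
                self.swap_outs += 1
            self.ram[page] = value
        if data is not None:
            self.ram[page] = data
        return self.ram[page]


mem = ToySwap(ram_frames=2)
mem.touch("A", "a")
mem.touch("B", "b")
mem.touch("C", "c")          # RAM is full, so A is swapped out
value = mem.touch("A")       # A is swapped back in, B is swapped out
print(value, mem.swap_outs, mem.swap_ins, sorted(mem.disk))
```

With only two frames of "RAM", the process still works with three pages, at the cost of extra transfers to and from the slower store.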
Common problems
I also had a lot of questions while learning about virtual memory.
32-bit and 64-bit
The most common question is about 32-bit versus 64-bit.
The CPU accesses memory through the physical bus, so the range of addresses it can access is limited by the number of address lines. A 32-bit machine has 32 address lines. Each line has two voltage levels, high and low, representing the bits 1 and 0. The largest addressable range is therefore 2^32 bytes = 4 GB, so installing more than 4 GB of memory on a 32-bit machine is useless (without extensions such as PAE): the CPU cannot address the memory beyond 4 GB.
But 64-bit machines do not have 64 address lines, and the maximum memory they can use is limited by the hardware and the operating system. On x86-64, for example, Linux with four-level page tables uses 48-bit virtual addresses, which gives a 256 TB virtual address space.
By the concept of virtual memory alone, running 64-bit software on a 32-bit system should be possible. However, because of how the system lays out virtual addresses, 64-bit virtual addresses cannot be used on a 32-bit system.
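The 4 GB limit of a 32-bit address follows directly from counting addresses. A short check, with an illustrative helper name:

```python
GIB = 2 ** 30


def addressable_bytes(address_lines: int) -> int:
    """Number of distinct byte addresses that this many address lines can select."""
    return 2 ** address_lines


limit_32 = addressable_bytes(32)
highest_address = limit_32 - 1

print(limit_32 // GIB, "GiB")   # 4 GiB
print(hex(highest_address))     # 0xffffffff
```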
Operating physical memory directly
The operating system uses virtual memory. What if we want to operate on physical memory directly?
Linux maps each device to a file under the /dev/ directory, and we can operate the hardware directly through these device files. Memory is no exception: in Linux, the memory device is mapped to /dev/mem, and the root user can operate on memory directly by reading and writing this file.
The JVM process occupies too much virtual memory
When using top to view system performance, we find that the Java process shows a large amount of virtual memory in the VIRT column.

The reason is that Java allocates through glibc's arena memory pools, which reserve a large amount of virtual memory without using it. In addition, files read by Java are also mapped into virtual memory, and under the default JVM configuration each Java thread stack occupies 1 MB of virtual memory. For details, see "Why multi-threaded programs under Linux consume so much virtual memory".
The physical memory actually occupied is shown in the RES (resident) column, whose value is the amount actually mapped to physical memory.
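The same two numbers can be read per process from /proc/<pid>/status, where the VmSize and VmRSS fields correspond to the virtual and resident sizes. The sketch below parses that format; the sample text uses made-up values, and `parse_status` is a hypothetical helper.

```python
import os


def parse_status(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status-style text."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(value.split()[0])
    return fields


# Made-up sample, for illustration only.
SAMPLE = "Name:\tjava\nVmSize:\t 8123456 kB\nVmRSS:\t  512000 kB\n"
sample_fields = parse_status(SAMPLE)
print(sample_fields)

# On Linux, the same function works on the current process.
if os.path.exists("/proc/self/status"):
    with open("/proc/self/status") as f:
        print(parse_status(f.read()))
```

In the sample, the process has reserved about 8 GB of virtual address space but keeps only about 500 MB resident in physical memory.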
Common management commands
We can also manage Linux virtual memory ourselves.
View system memory status
There are many ways to check the system memory status; commands such as free and vmstat print the current state. Note that the available memory is not just the free column. Because the operating system is lazy, large amounts of buffer/cache are not cleared immediately after a process stops using them. If the process that used them runs again, they can be reused, and they can also be reclaimed when necessary.
In addition, you can use cat /proc/meminfo to view detailed memory usage, including the state of dirty pages. For details, see "/PROC/MEMINFO Mystery".
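To see the difference between free and available memory in numbers, /proc/meminfo can be parsed with a few lines of Python. This is a sketch: `parse_meminfo` is a hypothetical helper and the sample values are invented.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value}; sizes are in kB."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            info[key.strip()] = int(rest.split()[0])
    return info


# Made-up sample values, for illustration only.
SAMPLE = """MemTotal:       16000000 kB
MemFree:          500000 kB
MemAvailable:    9000000 kB
Buffers:          300000 kB
Cached:          8500000 kB
Dirty:              1200 kB
"""

info = parse_meminfo(SAMPLE)
# Memory the kernel estimates it could hand out beyond what is currently free.
reclaimable_hint = info["MemAvailable"] - info["MemFree"]
print(info["MemFree"], info["MemAvailable"], reclaimable_hint)
```

On a real system, replace `SAMPLE` with the contents of /proc/meminfo. A small MemFree next to a large MemAvailable is the normal result of the lazy cache behavior described above.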
pmap
To view the virtual memory layout of a single process, use the pmap pid command, which lists each virtual memory segment from low addresses to high addresses.
Add the -XX option to output more detailed information.
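On Linux, the per-process mappings are also exposed as text in /proc/<pid>/maps, one segment per line. The sketch below parses a single line of that format; the sample line is illustrative, and `Mapping` and `parse_maps_line` are made-up names.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    path: str

    @property
    def size_kib(self):
        return (self.end - self.start) // 1024


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a Mapping."""
    fields = line.split(maxsplit=5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""  # anonymous maps have no path
    return Mapping(start, end, fields[1], path)


# Illustrative sample line, not taken from a real process.
SAMPLE_LINE = "00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/dbus-daemon"
segment = parse_maps_line(SAMPLE_LINE)
print(hex(segment.start), segment.perms, segment.size_kib, segment.path)
```

Applying `parse_maps_line` to every line of /proc/self/maps gives a rough equivalent of the basic pmap listing for the current process.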
Modify memory configuration
We can also modify the Linux system configuration: use sysctl [-options] vm.CONFIG, or read and write the files in the /proc/sys/vm/ directory directly, to view and modify the settings.
SWAP Operation
Swap is not always beneficial. Letting processes continuously exchange large amounts of data between memory and disk consumes a lot of CPU and I/O and reduces system efficiency, so sometimes we do not want to use swap.
We can set vm.swappiness=0 so that the kernel uses swap as little as possible, or simply use the swapoff command to disable swap.
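Since each vm setting is a file under /proc/sys/vm/, the current value can be inspected from a script. The helper below is a hypothetical sketch that only reads; changing a value means writing to the same file as root, which is not shown here.

```python
from pathlib import Path


def read_vm_setting(name, base="/proc/sys/vm"):
    """Return the integer value of a vm.* setting by reading its file."""
    return int((Path(base) / name).read_text().strip())


# Only attempt the real read where the file exists (Linux).
if Path("/proc/sys/vm/swappiness").exists():
    print("vm.swappiness =", read_vm_setting("swappiness"))
```

The `base` parameter exists only so the function can be pointed at another directory, for example when trying it out without touching the real system files.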
Summary
The concept of virtual memory is easy to understand, but it leads to a series of quite complex topics. This article covers only some basic principles and skips many details, such as the use of segment registers in virtual memory addressing and how the operating system uses virtual memory to improve cache and buffer usage. If there is an opportunity, I will discuss them separately.
