LINUX

Memory mapping mmap under Linux: principles, usage and advantages

王林

Feb 10, 2024 am 10:00 AM

linuxlinux tutoriallinux systemlinux commandshell scriptembeddedlinuxGetting started with linuxlinux learning

Memory mapping mmap is an important memory management mechanism in Linux systems. It allows user space processes to directly access physical memory or files without going through system calls or copying data. This can improve memory utilization and access efficiency, saving system resources and time. But, do you really understand how mmap works? Do you know how to use mmap in Linux? Do you know the advantages and limitations of mmap? This article will introduce you to the relevant knowledge of memory mapping mmap under Linux in detail, so that you can better use and understand this powerful memory management tool under Linux.

2. Basic functions

The mmap function is a system call under unix/linux. For details, please refer to Section 12.2 of Volume 2 of "Unix Netword Programming".

Themmap system call is not entirely designed for use with shared memory. It itself provides a different access method to ordinary files. The process can operate on ordinary files like reading and writing memory. The shared memory IPC of Posix or System V is purely used for sharing purposes. Of course, mmap() implementation of shared memory is also one of its main applications.

<code style="display: -webkit-box;font-family: Operator Mono, Consolas, Monaco, Menlo, monospace;border-radius: 0px;font-size: 12px"> **mmap系统调用使得进程之间通过映射同一个普通文件实现共享内存。普通文件被映射到进程地址空间后，进程可以像访问普通内存一样对文件进行访问，不必再调用read()，write（）等操作。mmap并不分配空间, 只是将文件映射到调用进程的地址空间里（但是会占掉你的 virutal memory）, 然后你就可以用memcpy等操作写文件, 而不用write()了.写完后，内存中的内容并不会立即更新到文件中，而是有一段时间的延迟，你可以调用msync()来显式同步一下, 这样你所写的内容就能立即保存到文件里了.这点应该和驱动相关。 不过通过mmap来写文件这种方式没办法增加文件的长度, 因为要映射的长度在调用mmap()的时候就决定了.如果想取消内存映射，可以调用munmap()来取消内存映射**
</code>

void * mmap(void *start, size_t length, int prot , int flags, int fd, off_t offset)

Mmap is used to map files into memory space. Simply put, mmap is to make an image of the contents of a file in memory. After the mapping is successful, the user's modifications to this memory area can be directly reflected in the kernel space. Similarly, the kernel space's modifications to this area are also directly reflected in the user space. Then it is very efficient for operations that require a large amount of data transmission between the kernel space and user space.

principle

First of all, the word "mapping" has the same meaning as the "one-to-one mapping" mentioned in mathematics class, which is to establish a one-to-one correspondence. Here it mainly refers to the location of the file on the hard disk and the logical address of the process. One-to-one correspondence between areas of the same size in space, as shown in process 1 in Figure 1. This correspondence is purely a logical concept and does not exist physically. The reason is that the logical address space of the process itself does not exist. In the process of memory mapping, there is no actual data copy. The file is not loaded into the memory, but is logically put into the memory. Specific to the code, the relevant data structure (struct address_space) is established and initialized. This process There is a system call mmap() to implement, so the efficiency of establishing memory mapping is very high.

Figure 1. Memory mapping principle

Since the memory mapping is established without actual data copying, how can the process finally directly access the files on the hard disk through memory operations? That depends on several related processes after memory mapping.

mmap() will return a pointer ptr, which points to an address in the process's logical address space. In this way, the process no longer needs to call read or write to read and write the file, but only needs to use ptr to operate the file. However, ptr points to a logical address. To operate the data therein, the logical address must be converted into a physical address through the MMU, as shown in process 2 in Figure 1. This process has nothing to do with memory mapping.

As mentioned before, establishing the memory mapping does not actually copy the data. At this time, the MMU cannot find the physical address corresponding to ptr in the address mapping table. That is, the MMU fails and a page fault interrupt will be generated. The interrupt response function of the page interrupt will search for the corresponding page in the swap. If it cannot be found (that is, the file has never been read into the memory), the mapping relationship established by mmap() will be used to copy the page from the hard disk. The file is read into physical memory, as shown in process 3 in Figure 1. This process has nothing to do with memory mapping.

If it is found that the physical memory is not enough when copying data, the temporarily unused physical pages will be swapped to the hard disk through the virtual memory mechanism (swap), as shown in process 4 in Figure 1. This process also has nothing to do with memory mapping.

效率

从代码层面上看，从硬盘上将文件读入内存，都要经过文件系统进行数据拷贝，并且数据拷贝操作是由文件系统和硬件驱动实现的，理论上来说，拷贝数据的效率是一样的。但是通过内存映射的方法访问硬盘上的文件，效率要比read和write系统调用高，这是为什么呢？原因是read()是系统调用，其中进行了数据拷贝，它首先将文件内容从硬盘拷贝到内核空间的一个缓冲区，如图2中过程1，然后再将这些数据拷贝到用户空间，如图2中过程2，在这个过程中，实际上完成了两次数据拷贝；而mmap()也是系统调用，如前所述，mmap()中没有进行数据拷贝，真正的数据拷贝是在缺页中断处理时进行的，由于mmap()将文件直接映射到用户空间，所以中断处理函数根据这个映射关系，直接将文件从硬盘拷贝到用户空间，只进行了一次数据拷贝。因此，内存映射的效率要比read/write效率高。

图2.read系统调用原理

下面这个程序，通过read和mmap两种方法分别对硬盘上一个名为“mmap_test”的文件进行操作，文件中存有10000个整数，程序两次使用不同的方法将它们读出，加1，再写回硬盘。通过对比可以看出，read消耗的时间将近是mmap的两到三倍。

  1 #include
  2 
  3 #include
  4 
  5 #include
  6 
  7 #include
  8 
  9 #include
 10 
 11 #include
 12 
 13 #include
 14 
 15 #include
 16 
 17 #include
 18 
 19  
 20 
 21 #define MAX 10000
 22 
 23  
 24 
 25 int main()
 26 
 27 {
 28 
 29 int i=0;
 30 
 31 int count=0, fd=0;
 32 
 33 struct timeval tv1, tv2;
 34 
 35 int *array = (int *)malloc( sizeof(int)*MAX );
 36 
 37  
 38 
 39 /*read*/
 40 
 41  
 42 
 43 gettimeofday( &tv1, NULL );
 44 
 45 fd = open( "mmap_test", O_RDWR );
 46 
 47 if( sizeof(int)*MAX != read( fd, (void *)array, sizeof(int)*MAX ) )
 48 
 49 {
 50 
 51 printf( "Reading data failed.../n" );
 52 
 53 return -1;
 54 
 55 }
 56 
 57 for( i=0; iif( sizeof(int)*MAX != write( fd, (void *)array, sizeof(int)*MAX ) )
 64 
 65 {
 66 
 67 printf( "Writing data failed.../n" );
 68 
 69 return -1;
 70 
 71 }
 72 
 73 free( array );
 74 
 75 close( fd );
 76 
 77 gettimeofday( &tv2, NULL );
 78 
 79 printf( "Time of read/write: %dms/n", tv2.tv_usec-tv1.tv_usec );
 80 
 81  
 82 
 83 /*mmap*/
 84 
 85  
 86 
 87 gettimeofday( &tv1, NULL );
 88 
 89 fd = open( "mmap_test", O_RDWR );
 90 
 91 array = mmap( NULL, sizeof(int)*MAX, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0 );
 92 
 93 for( i=0; iprintf( "Time of mmap: %dms/n", tv2.tv_usec-tv1.tv_usec );
110 
111  
112 
113 return 0;
114 
115 }

通过本文，你应该对 Linux 下的内存映射 mmap 有了一个深入的了解，知道了它的定义、原理、用法和优势。你也应该明白了 mmap 的适用场景和注意事项，以及如何在 Linux 下正确地使用 mmap。我们建议你在需要高效地访问物理内存或者文件时，使用 mmap 来提高程序的性能和可移植性。同时，我们也提醒你在使用 mmap 时要注意一些潜在的风险和问题，如同步、保护、错误处理等。希望本文能够帮助你更好地使用 Linux 系统，让你在 Linux 下享受内存映射的便利和快感。

The above is the detailed content of Memory mapping mmap under Linux: principles, usage and advantages. For more information, please follow other related articles on the PHP Chinese website!

Statement

This article is reproduced at:良许Linux教程网. If there is any infringement, please contact admin@php.cn delete

What are the differences in virtualization support between Linux and Windows?Apr 22, 2025 pm 06:09 PM

The main differences between Linux and Windows in virtualization support are: 1) Linux provides KVM and Xen, with outstanding performance and flexibility, suitable for high customization environments; 2) Windows supports virtualization through Hyper-V, with a friendly interface, and is closely integrated with the Microsoft ecosystem, suitable for enterprises that rely on Microsoft software.

What are the main tasks of a Linux system administrator?Apr 19, 2025 am 12:23 AM

The main tasks of Linux system administrators include system monitoring and performance tuning, user management, software package management, security management and backup, troubleshooting and resolution, performance optimization and best practices. 1. Use top, htop and other tools to monitor system performance and tune it. 2. Manage user accounts and permissions through useradd commands and other commands. 3. Use apt and yum to manage software packages to ensure system updates and security. 4. Configure a firewall, monitor logs, and perform data backup to ensure system security. 5. Troubleshoot and resolve through log analysis and tool use. 6. Optimize kernel parameters and application configuration, and follow best practices to improve system performance and stability.

Is it hard to learn Linux?Apr 18, 2025 am 12:23 AM

Learning Linux is not difficult. 1.Linux is an open source operating system based on Unix and is widely used in servers, embedded systems and personal computers. 2. Understanding file system and permission management is the key. The file system is hierarchical, and permissions include reading, writing and execution. 3. Package management systems such as apt and dnf make software management convenient. 4. Process management is implemented through ps and top commands. 5. Start learning from basic commands such as mkdir, cd, touch and nano, and then try advanced usage such as shell scripts and text processing. 6. Common errors such as permission problems can be solved through sudo and chmod. 7. Performance optimization suggestions include using htop to monitor resources, cleaning unnecessary files, and using sy

What is the salary of Linux administrator?Apr 17, 2025 am 12:24 AM

The average annual salary of Linux administrators is $75,000 to $95,000 in the United States and €40,000 to €60,000 in Europe. To increase salary, you can: 1. Continuously learn new technologies, such as cloud computing and container technology; 2. Accumulate project experience and establish Portfolio; 3. Establish a professional network and expand your network.

What is the main purpose of Linux?Apr 16, 2025 am 12:19 AM

The main uses of Linux include: 1. Server operating system, 2. Embedded system, 3. Desktop operating system, 4. Development and testing environment. Linux excels in these areas, providing stability, security and efficient development tools.

Does the internet run on Linux?Apr 14, 2025 am 12:03 AM

The Internet does not rely on a single operating system, but Linux plays an important role in it. Linux is widely used in servers and network devices and is popular for its stability, security and scalability.

What are Linux operations?Apr 13, 2025 am 12:20 AM

The core of the Linux operating system is its command line interface, which can perform various operations through the command line. 1. File and directory operations use ls, cd, mkdir, rm and other commands to manage files and directories. 2. User and permission management ensures system security and resource allocation through useradd, passwd, chmod and other commands. 3. Process management uses ps, kill and other commands to monitor and control system processes. 4. Network operations include ping, ifconfig, ssh and other commands to configure and manage network connections. 5. System monitoring and maintenance use commands such as top, df, du to understand the system's operating status and resource usage.

Boost Productivity with Custom Command Shortcuts Using Linux AliasesApr 12, 2025 am 11:43 AM

Introduction Linux is a powerful operating system favored by developers, system administrators, and power users due to its flexibility and efficiency. However, frequently using long and complex commands can be tedious and er

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

3 weeks agoByDDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

2 weeks agoByDDD

Where to find the Crane Control Keycard in Atomfall

3 weeks agoByDDD

Roblox: Dead Rails - How To Complete Every Challenge

3 weeks agoByDDD

Atomfall guide: item locations, quest guides, and tips

4 weeks agoByDDD

Hot Tools

ZendStudio 13.5.1 Mac

Powerful PHP integrated development environment

mPDF

mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),