Memory management in Linux protected mode

王林

Jul 06, 2023 pm 03:20 PM


Memory can be regarded as a very large array. Just as an element of an array is located by its subscript, a location in memory is located by its address. The premise is that this array is an ordered sequence of bytes in which each byte has a unique address, called its memory address.

Many objects are stored in memory, and each object occupies one or more bytes: a char object, a byte object, an int object, and so on, each placed at its own location in memory. The operation by which the CPU locates these objects in memory is called memory addressing. The width of the address bus determines how many address bits are available, with addresses starting from 0. Since the 80x86 discussed here is 32 bits, its address bus is also 32 bits wide, so there are 2^32 memory addresses in total and 4 GB of memory can be addressed. Multi-byte data types such as int, long, and double are stored at consecutive memory addresses and read back from them.

Although objects can be addressed, the order in which the bytes of a multi-byte object are stored can differ. There are two storage orders, big-endian and little-endian.

For example, take an int object located at address 0x100 whose hexadecimal value is 0x01234567. The picture below shows the difference between the two storage orders.

[Figure: byte layout of the int 0x01234567 at address 0x100 in big-endian and little-endian order]

This is easy to understand. The int value 0x01234567 can be split into the bytes 01 23 45 67, where 01 is the most significant byte and 67 is the least significant byte. Little-endian stores the least significant byte first (at the lowest address), while big-endian stores the most significant byte first. The difference between big-endian and little-endian is only the storage order and has nothing to do with the number of digits or the numerical value of the object. Most Intel machines use little-endian mode, so the 80x86 is also little-endian, while most IBM and Oracle machines use big-endian storage.
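The two layouts can be reproduced with a few lines of Python. This is only an illustrative sketch: it uses the standard `struct` module to pack the example value 0x01234567 in both byte orders and prints each byte next to the address it would occupy if the object started at 0x100. The helper name `dump` is made up for this example.

```python
import struct

value = 0x01234567

# Pack the same 32-bit integer in both byte orders.
big = struct.pack(">I", value)     # most significant byte first
little = struct.pack("<I", value)  # least significant byte first

def dump(label, data, base=0x100):
    """Print each byte next to the address it would occupy."""
    cells = " ".join(f"{base + i:#x}:{b:02x}" for i, b in enumerate(data))
    print(f"{label:<14}{cells}")

dump("big-endian", big)
dump("little-endian", little)

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```

Reading the little-endian bytes as if they were big-endian would give 0x67452301 instead, which is why the byte order has to be agreed on whenever raw memory is exchanged between machines.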

Because memory is too large for the computer to address all of it directly at once, memory is generally segmented. This raises a question: why should memory be segmented? The above is only a brief introduction to this; the article linked below covers it.

Why should memory be segmented?

https://www.php.cn/link/d005ce7aeef46bd18515f783fb8e87fa

Using the segmentation mechanism, the memory space is divided into linear regions, and a location in each region is found from the segment base address plus the offset within the segment. The segment is specified by a 16-bit segment selector, of which 14 bits select one of 2^14, that is, 16384 segments. The offset within the segment is specified by a 32-bit value, so the offset can range from 0 to 4 GB and the maximum length of a segment is 4 GB, which corresponds to the 4 GB of memory addresses mentioned above. The 48-bit address, or long pointer, formed by a 16-bit segment selector and a 32-bit offset is called a logical address, and the logical address is the virtual address.

The x86 architecture has six segment registers: CS, DS, ES, SS, FS, and GS. Each holds a segment selector that identifies a segment and hence its base address. CS addresses the code segment, SS addresses the stack segment, and the other registers address data segments. The segment addressed by CS at any given moment is called the current code segment. The offset of the next instruction to be executed in the current code segment is held in the EIP register, so the address of that instruction can be written in segment:offset form as CS:EIP.

The segment addressed by the segment register SS is called the current stack segment. The top of the stack is given by the ESP register, so SS:ESP always points to the top of the stack. The other four are general data segment registers; when an instruction does not name a segment explicitly, the data segment is given by DS by default.

Address Translation

Usually, a complete memory management system consists of two parts: access protection and address translation. Access protection prevents one application from accessing memory that is used by another program; address translation gives different applications a way to be assigned addresses dynamically. The two complement each other.

Address translation usually uses a memory block as its basic unit. Here is an explanation of what a block is. In Linux everything is a file, and a file is composed of blocks. A block is the unit from which the file system is built and is also the basic unit of data processing. Common block sizes include 512 B, 1 KB, and 4 KB. Although a block is the basic unit, it is in turn made up of sectors.

There are two ways to implement address translation: the segmentation mechanism and the paging mechanism. Memory management on x86 combines both. The following diagram shows how a virtual address is converted to a physical address through segmentation and paging:

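[Figure: conversion of a virtual (logical) address to a linear address by segmentation and then to a physical address by paging]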

This picture needs some explanation:

First of all, the picture contains three kinds of address and the conversions between them. The logical address becomes a linear address after segment translation. In protected mode the linear address is the segment base address plus the offset within the segment, so this picture is an address translation diagram for protected mode. The linear address is then converted into a physical address by the paging mechanism, provided that paging is enabled; if paging is not enabled, the linear address is the physical address.

The logical address deserves another word. It consists of a segment selector and an offset within the segment. The concept of the segment selector can seem vague at first. Put simply, it can be understood as the protected-mode counterpart of the segment base address: it identifies the segment. The segment selector is 16 bits and the offset within the segment is 32 bits.

Many books and articles use two different terms for the segment selector. They refer to the same thing; the difference is purely one of translation, and in English both are simply "selector".

The segment descriptor will be mentioned later. The segment descriptor and the segment selector are not the same thing: the segment selector is a 16-bit identifier that picks out a segment descriptor.

One thing is not shown in the picture. We now know that logical addresses are converted into linear addresses and linear addresses into physical addresses, but what actually performs the conversion? It is the MMU (memory management unit): logical addresses are converted into linear addresses by its segmentation hardware, and linear addresses are converted into physical addresses by the hardware circuit of the paging unit. This article does not discuss the conversion process in detail; it focuses on the two mechanisms of segmentation and paging.

Let’s talk about the two mechanisms of segmentation and paging in detail.

Segmentation Mechanism

I recommend that you first read the description of "Why memory needs to be segmented" that I wrote.

https://www.php.cn/link/d005ce7aeef46bd18515f783fb8e87fa

Multiple programs can run in the same memory space without interfering with each other because segmentation provides a mechanism for isolating code, data, and stack areas. If multiple programs or tasks are running on the CPU, each can be allocated its own set of segments (program code, data, and stack). The CPU prevents applications from interfering with each other by enforcing the boundaries between segments.

All segments used in a system are contained in the linear address space of the CPU. To locate a byte in a given segment, the program must provide a logical address, which consists of a segment selector and an offset within the segment. Each segment has a segment descriptor, which specifies the size of the segment, its access rights and privilege level, the segment type, and the location of the first byte of the segment in the linear address space (the segment base address). The offset part of the logical address is added to the segment base address to locate a byte in the segment, so segment base address + offset forms an address in the CPU's linear address space.
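The base-plus-offset step can be sketched in Python as follows. This is a simplified model, not the real descriptor layout: the class `SegmentDescriptor`, its two fields, and the example values are assumptions made for illustration, and a `ValueError` stands in for the protection exception that the hardware would raise when an offset falls outside the segment.

```python
from dataclasses import dataclass

@dataclass
class SegmentDescriptor:
    base: int    # linear address of the first byte of the segment
    limit: int   # highest valid offset within the segment

def logical_to_linear(descriptor, offset):
    """Return base + offset, refusing offsets outside the segment."""
    if offset > descriptor.limit:
        raise ValueError("offset exceeds segment limit")
    # Linear addresses are 32 bits wide, so wrap at 2^32.
    return (descriptor.base + offset) & 0xFFFFFFFF

# Hypothetical data segment: 64 KB starting at linear address 0x00400000.
data_segment = SegmentDescriptor(base=0x00400000, limit=0xFFFF)
print(hex(logical_to_linear(data_segment, 0x1234)))  # 0x401234
```

The limit check is what lets the CPU enforce segment boundaries: an offset beyond the limit never reaches the addition.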

The linear address space has the same structure as the physical address space, but the logical address space is much larger than either. The virtual (logical) address space can contain up to 16 K segments of up to 4 GB each, so virtual addresses span a total of 64 TB (2^46 bytes), whereas the linear address space and the physical address space are each 4 GB (2^32 bytes). If paging is disabled, the linear address space is the physical address space.
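The sizes quoted above follow directly from the bit widths. A quick check in Python, using only the figures already given in the text (14 selector bits and a 32-bit offset):

```python
segments = 2 ** 14        # 14 selector bits choose the segment
segment_size = 2 ** 32    # a 32-bit offset gives up to 4 GB per segment

virtual_space = segments * segment_size
linear_space = 2 ** 32

print(segments)                   # 16384
print(virtual_space == 2 ** 46)   # True
print(virtual_space // 2 ** 40)   # 64 (TB)
print(linear_space // 2 ** 30)    # 4 (GB)
```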

[Figure: mapping from logical address to linear address to physical address through the GDT and LDT]

This picture shows the mapping logical address -> linear address -> physical address. The GDT and the LDT each cover half of the logical address space: each holds up to 8192 segments, and each segment has a maximum length of 4 GB. Whether a descriptor is looked up in the GDT or in the LDT depends on the TI bit of the segment selector. The structure of the segment selector is as follows:

[Figure: structure of the segment selector]

The segment selector is divided into three parts:

TI (Table Indicator): indicates which table to query. TI = 0 queries the GDT; TI = 1 queries the LDT.
Index: the CPU multiplies the index by 8 (the size of one descriptor) and adds the result to the base address of the GDT or LDT to find the segment descriptor to load.
RPL (Requested Privilege Level): the two lowest bits of the selector, which hold the privilege level of the request.
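As a rough illustration, the fields of a selector can be pulled apart with shifts and masks. The sketch below assumes the standard x86 layout (bits 0-1 hold the requested privilege level, bit 2 is TI, bits 3-15 are the index); the function names and the table base addresses in the example are hypothetical.

```python
def parse_selector(selector):
    """Split a 16-bit segment selector into (index, ti, rpl)."""
    selector &= 0xFFFF
    rpl = selector & 0b11          # bits 0-1: requested privilege level
    ti = (selector >> 2) & 0b1     # bit 2: 0 = GDT, 1 = LDT
    index = selector >> 3          # bits 3-15: descriptor index
    return index, ti, rpl

def descriptor_address(selector, gdt_base, ldt_base):
    """Locate the 8-byte descriptor that the selector refers to."""
    index, ti, _ = parse_selector(selector)
    table_base = ldt_base if ti else gdt_base
    return table_base + index * 8

# Hypothetical table locations, chosen only for the example.
GDT_BASE, LDT_BASE = 0x1000, 0x2000

print(parse_selector(0x2B))                               # (5, 0, 3)
print(hex(descriptor_address(0x2B, GDT_BASE, LDT_BASE)))  # 0x1028
print(hex(descriptor_address(0x0F, GDT_BASE, LDT_BASE)))  # 0x2008
```

Because the index is 13 bits wide, each table can hold at most 2^13 = 8192 descriptors, which matches the figure given for the GDT and LDT.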

Segment descriptors are not explained in detail here, because this article is about memory management as a whole rather than such details.

The GDTR register holds the base address of the GDT. The segment selector locates a segment descriptor in the table, and the base address in that descriptor is combined with the offset within the segment. In this way the MMU converts a segment selector and an offset into a linear address.

Paging mechanism

As mentioned above, the linear address is obtained from the logical address. If paging is disabled, the linear address is the physical address. If paging is enabled, the linear address space and the physical address space are no longer the same. Programs generally run as multiple tasks, and the linear address space defined by those tasks is usually much larger than the physical memory capacity. How can that be, when the address translation diagram shows both the linear and the physical address space as 4 GB? Because linear addresses are virtualized by virtual storage technology.

Virtual storage is a memory management technique that creates the illusion of a memory space much larger than the actual physical memory. In essence it virtualizes memory: the machine may have only 4 GB of memory while programs behave as if it had 64 GB, which is why so many applications can be open at once.

The paging mechanism is one implementation of this virtualization. A large linear address space is mapped onto a smaller amount of physical memory (RAM or ROM). When paging is used, each segment is divided into pages (usually 4 KB), and these pages are stored either in physical memory or on disk. The operating system keeps track of the pages with a page directory and page tables. When a program accesses a location in the linear address space, the CPU uses the page directory and page tables to convert the linear address into a physical address and then accesses that location in physical memory.

If the page being accessed is not in physical memory, the CPU interrupts the program by raising an exception, normally a page fault. The operating system then reads the page from disk into physical memory and resumes the program at the point of interruption. Frequent swapping of pages in and out can become a performance bottleneck.
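The lookup through the page directory and page tables can be modelled in a few lines of Python. This is a sketch under stated assumptions rather than a description of real hardware structures: it assumes the classic 32-bit layout with 4 KB pages (10 bits of directory index, 10 bits of table index, 12 bits of offset), represents the tables as nested dictionaries, and uses a made-up `PageFault` exception for a page that is not present.

```python
PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when the page holding a linear address is not present."""

def translate(linear, page_directory):
    """Walk a two-level table: directory index -> table index -> offset."""
    dir_index = (linear >> 22) & 0x3FF     # top 10 bits
    table_index = (linear >> 12) & 0x3FF   # middle 10 bits
    offset = linear & 0xFFF                # low 12 bits

    page_table = page_directory.get(dir_index)
    if page_table is None or table_index not in page_table:
        raise PageFault(hex(linear))
    frame = page_table[table_index]        # physical frame number
    return frame * PAGE_SIZE + offset

# Hypothetical mapping: linear page 0x00401 -> physical frame 0x12345.
page_directory = {1: {1: 0x12345}}

print(hex(translate(0x00401ABC, page_directory)))  # 0x12345abc

try:
    translate(0x00800000, page_directory)
except PageFault as fault:
    # A real operating system would load the page from disk here
    # and then retry the access.
    print("page fault at", fault)
```

The offset within the page is copied unchanged; only the page number is translated, which is what makes fixed-size pages simple to manage.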

In segmentation, the length of each segment is not fixed, up to a maximum of 4 GB, whereas in paging the size of each page is fixed. Fixed-size pages, whether they reside in physical memory or on disk, are better suited to managing physical memory, while variable-sized segments are better suited to handling the logical partitions of a complex system.

Although segmentation and paging are both address translation mechanisms, they are two different mechanisms, and each operates independently during address translation. Both use an intermediate table to store their mappings, but the structure of the table differs. The segment table is stored in the linear address space, and the page table is stored in the physical address space.

Protection mechanism

The 80x86 has two protection mechanisms. The first achieves complete isolation between tasks by giving each task a different virtual address space, that is, a different mapping from logical addresses to physical addresses. Each application can access only the data and instructions in its own virtual space and can obtain physical addresses only through its own mapping. The second operates within a task: it protects the operating system's memory segments and some special registers from being accessed by applications. Both mechanisms are discussed in detail below.

Protection between tasks

Each task is placed in its own virtual address space, which the hardware then maps to physical addresses. The virtual addresses of different tasks are mapped to different physical addresses, so a virtual address of task A is never mapped into the physical address range of task B. All tasks are thereby isolated and cannot interfere with each other.

Each task has its own mapping table, segment table and page table. When the CPU switches between different applications or tasks, these tables will also switch.

The virtual address space is an abstraction provided by the operating system, which makes it easier to manage applications and tasks. The operating system is mapped into the virtual address space of every task, which means that every task can reach the operating system and the operating system is shared by all tasks. The part of the virtual address space that is the same for all tasks is called the global address space, and Linux uses the global address space.

In addition to the global address space, each task has a part of the virtual address space that is unique to it. This part is called the local address space.

Special protection of memory segments and registers

If the protection between different tasks is likened to horizontal protection, then the protection of memory segments and registers can be regarded as vertical protection. To restrict access to the various segments within a task, four privilege levels are used.

Privilege is divided into 4 levels, where 0 is the highest and 3 is the lowest. The most sensitive data is given the highest privilege level and can be accessed only by the most trusted part of the task; less sensitive data is given a lower privilege level. The operating system kernel generally runs at level 0 and application data is generally at level 3. Each memory segment is associated with a privilege level.

The CPU fetches the instructions it executes from the segment addressed by CS, and that segment has a privilege level. The privilege level of the currently executing code is called the Current Privilege Level (CPL). When a program attempts to access a segment, the CPL is compared with the privilege level of that segment, and access is allowed only if the CPL is at least as privileged as the segment's level, that is, numerically less than or equal to it.
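The comparison can be written as a one-line check. The sketch below is a deliberate simplification for data-segment access: the function name `can_access` and its parameter names are made up for this example, and the requested privilege level stored in the selector, which the real check also takes into account, is ignored.

```python
def can_access(cpl, segment_level):
    """Return True when code running at `cpl` may access the segment.

    Level 0 is the most privileged and level 3 the least, so access is
    allowed when cpl is numerically less than or equal to the segment's
    level. Simplification: the selector's RPL field is ignored.
    """
    return cpl <= segment_level

KERNEL, USER = 0, 3

print(can_access(KERNEL, USER))    # True: kernel code may read user data
print(can_access(USER, USER))      # True: same privilege level
print(can_access(USER, KERNEL))    # False: user code cannot touch kernel data
```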


Statement

This article is reproduced from 51CTO.COM. In case of any infringement, please contact admin@php.cn for removal.
