
Detailed explanation of file operations in server programming


Under Linux, everything is a file. The virtual file system (VFS) layer hides the differences between the underlying implementations, so users can drive very different devices through one uniform interface. To refer to an open file, a process uses a file descriptor, which is similar to a handle on Windows; most operations on the file, such as read and write, go through this descriptor. For every open file, the kernel uses three data structures to manage it:

(1) Each process has an entry in the process table, and each entry contains a table of open file descriptors, which can be treated as a vector with one slot per descriptor. Associated with each file descriptor are:

 (a) The file descriptor flags. (Currently only one file descriptor flag, FD_CLOEXEC, is defined.)

 (b) Pointer to a file table entry.

(2) The kernel maintains a file table for all open files. Each file table entry contains:

 (a) The file status flags (read, write, append, sync, non-blocking, etc.).

 (b) The current file offset, i.e. the value manipulated by the lseek function (see the sketch after this list).

 (c) Pointer to the v-node entry of the file.

(3) Each open file (or device) has a v-node structure. The v-node contains information about the type of the file and pointers to the functions that operate on the file. For most files, the v-node also contains the file's i-node (index node). This information is read from disk into memory when the file is opened, so that all information about the file is readily available. For example, the i-node contains the file's owner, the file's size, the device on which the file resides, pointers to the file's actual data blocks on disk, and so on.
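
The current file offset mentioned in (2)(b) lives in the file table entry, one per open of the file. As a minimal sketch (the scratch file /tmp/offset_demo is hypothetical, chosen only for illustration), the program below writes a few bytes and then queries and rewinds the offset with lseek:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical scratch file, used only for illustration. */
    int fd = open("/tmp/offset_demo", O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }

    write(fd, "hello", 5);                 /* advances the offset stored in the file table entry */

    off_t cur = lseek(fd, 0, SEEK_CUR);    /* query the current offset: now 5 */
    printf("offset after write: %ld\n", (long)cur);

    lseek(fd, 0, SEEK_SET);                /* rewind to the beginning of the file */
    char buf[8];
    ssize_t n = read(fd, buf, 5);          /* reads back "hello" */
    printf("read %zd bytes\n", n);

    close(fd);
    return 0;
}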

These three layers of encapsulation each carry a different responsibility: from top to bottom, the first layer identifies files within a process, the second layer manages the per-open state that is independent of any single process, and the third layer manages file system metadata and corresponds directly to a file. One advantage of this layering is that an upper layer can reuse the structures of the layer below it: several file descriptor entries may point to the same file table entry, and several file table entries may point to the same v-node.

If two independent processes open the same file, each process gets its own file table entry, but the v-node pointers of the two file table entries point to the same v-node. This arrangement lets each process keep its own current offset into the file and open the file in its own mode (O_RDONLY, O_WRONLY, O_RDWR).
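
The same sharing can be observed from a single program: opening the same path twice creates two file table entries (two independent offsets) that refer to one v-node/i-node. A minimal sketch, assuming a hypothetical file /tmp/share_demo:

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical file path, used only for illustration. */
    int fd1 = open("/tmp/share_demo", O_RDWR | O_CREAT | O_TRUNC, 0644);
    int fd2 = open("/tmp/share_demo", O_RDONLY);
    if (fd1 == -1 || fd2 == -1) { perror("open"); return 1; }

    write(fd1, "abcdef", 6);    /* moves fd1's offset to 6 */

    printf("fd1 offset: %ld\n", (long)lseek(fd1, 0, SEEK_CUR));  /* 6 */
    printf("fd2 offset: %ld\n", (long)lseek(fd2, 0, SEEK_CUR));  /* still 0: separate file table entry */

    struct stat st1, st2;
    fstat(fd1, &st1);
    fstat(fd2, &st2);
    /* Same inode number: both file table entries point to the same v-node/i-node. */
    printf("same inode: %d\n", st1.st_ino == st2.st_ino);

    close(fd1);
    close(fd2);
    return 0;
}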

When a process creates a child with fork, the corresponding file descriptors in the parent and the child share the same file table entries; in other words, a parent descriptor and its counterpart in the child point to the same open file. After a fork we normally close the descriptors we do not need: for example, when parent and child communicate through a pipe or socketpair, each side usually closes the end it does not read from (or write to). A close only actually destroys the file table entry when no file descriptor references it any more, which is essentially reference counting. This is also the difference between close and shutdown in network programming: close only really tears down the connection when the last descriptor referring to the socket is closed, whereas shutdown shuts down one direction of the connection unconditionally. In a multi-threaded program, however, the threads share one address space, so there is only a single copy of each file descriptor, jointly owned by all threads; a thread must not close a descriptor just because it does not need it, or the other threads that still need that descriptor will be affected. Because descriptors opened before fork share the same file table entries in parent and child, server programs on some systems that use the preforking model (the server forks several children in advance, and every child calls accept on the same listenfd) can suffer from the thundering herd problem: all the children block in accept, and when the first client connection arrives every child is woken up even though only one of them gets the connection, which hurts performance. See UNP, p. 657.
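
A minimal sketch of the pipe-plus-fork pattern described above, in which each side closes the end it does not use and the pipe is torn down only when the last descriptor referencing it is closed:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int pipefd[2];                    /* pipefd[0]: read end, pipefd[1]: write end */
    if (pipe(pipefd) == -1) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                   /* child: only reads from the pipe */
        close(pipefd[1]);             /* close the write end it does not need */
        char buf[32];
        ssize_t n = read(pipefd[0], buf, sizeof(buf) - 1);
        if (n > 0) { buf[n] = '\0'; printf("child read: %s\n", buf); }
        close(pipefd[0]);
        _exit(0);
    }

    /* parent: only writes to the pipe */
    close(pipefd[0]);                 /* close the read end it does not need */
    write(pipefd[1], "hello", 5);
    close(pipefd[1]);                 /* last reference to the write end: the child sees EOF */
    wait(NULL);
    return 0;
}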

If exec is called after fork, open file descriptors remain open by default, which can be used to hand certain descriptors to the program run by exec. The FD_CLOEXEC file descriptor flag controls this behavior: a descriptor with FD_CLOEXEC set is closed automatically across exec, while a descriptor without it stays open.
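
A minimal sketch of manipulating the close-on-exec flag, either after the fact with fcntl or atomically at open time with O_CLOEXEC (the file path is hypothetical):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical file path, used only for illustration. */
    int fd = open("/tmp/cloexec_demo", O_RDWR | O_CREAT, 0644);
    if (fd == -1) { perror("open"); return 1; }

    /* Read the current file descriptor flags and turn on FD_CLOEXEC:
       this descriptor will now be closed automatically across exec. */
    int flags = fcntl(fd, F_GETFD);
    fcntl(fd, F_SETFD, flags | FD_CLOEXEC);

    /* Alternatively, request the flag atomically when opening the file. */
    int fd2 = open("/tmp/cloexec_demo", O_RDONLY | O_CLOEXEC);

    printf("fd=%d fd2=%d\n", fd, fd2);
    close(fd);
    if (fd2 != -1)
        close(fd2);
    return 0;
}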

A file descriptor can also be duplicated explicitly with dup or fcntl(F_DUPFD); the original and the copy point to the same file table entry and therefore share the file offset and status flags. dup2 duplicates a descriptor onto a specific descriptor number.
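
A minimal sketch that uses dup and dup2 to redirect standard output into a file (the log path is hypothetical):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical log file, used only for illustration. */
    int fd = open("/tmp/dup_demo.log", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) { perror("open"); return 1; }

    int copy = dup(fd);           /* copy and fd share one file table entry (same offset) */

    dup2(fd, STDOUT_FILENO);      /* make descriptor 1 refer to that same file table entry */
    printf("this line goes into /tmp/dup_demo.log\n");
    fflush(stdout);

    close(copy);
    close(fd);                    /* descriptor 1 still keeps the file table entry alive */
    return 0;
}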

Each process has its own file descriptor table, and these tables are independent between processes: a descriptor number in one process has no direct relationship to the same number in another process. A descriptor can therefore be passed around freely within a process, but it loses its meaning if passed to another process as a plain integer. On Unix, descriptors can be passed between processes with sendmsg/recvmsg over a Unix domain socket (see UNP section 15.7). The first three descriptors of every process correspond to standard input, standard output, and standard error. The number of descriptors a process may open is limited, however; open too many and you run into the "Too many open files" problem, and in a network server a call to accept on listenfd then fails with EMFILE. File descriptors are an important and finite system resource: by default a single process is limited to 1024 of them, which can be checked with the ulimit -n command. The limit can of course be raised, but that treats the symptom rather than the cause, because when handling highly concurrent services the server's resources are finite and can still be exhausted.
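
The same limit can be inspected and raised from inside the program with getrlimit/setrlimit on RLIMIT_NOFILE, the programmatic counterpart of ulimit -n. A minimal sketch:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1) { perror("getrlimit"); return 1; }

    /* rlim_cur is the soft limit reported by "ulimit -n" (often 1024);
       rlim_max is the hard ceiling the process may raise it to. */
    printf("soft limit: %llu, hard limit: %llu\n",
           (unsigned long long)rl.rlim_cur, (unsigned long long)rl.rlim_max);

    /* Raise the soft limit up to the hard limit (no extra privilege needed). */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
        perror("setrlimit");

    return 0;
}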

When epoll is used in level-triggered mode to watch listenfd, this becomes dangerous: if pending connections are not accepted, the TCP accept queue fills up, listenfd keeps reporting readable events, and the server spins in a busy loop. Chen Shuo, the author of the C++ open source network library muduo, handles this by reserving an idle file descriptor in advance. When an EMFILE error occurs, he first closes the idle file to free one descriptor of quota, then accepts the pending connection and immediately closes the accepted socket's descriptor, thereby disconnecting the client gracefully, and finally reopens the idle file to plug the "hole" again in case the situation recurs.

// "Reserve" a spare file descriptor at program startup
int idlefd = open("/dev/null", O_RDONLY | O_CLOEXEC);
...

// Then handle the EMFILE error when it occurs
peerlen = sizeof(peeraddr);
connfd = accept4(listenfd, (struct sockaddr *)&peeraddr, &peerlen,
                 SOCK_NONBLOCK | SOCK_CLOEXEC);

if (connfd == -1)
{
    if (errno == EMFILE)
    {
        close(idlefd);                                      // free one descriptor of quota
        idlefd = accept(listenfd, NULL, NULL);              // accept the pending connection
        close(idlefd);                                      // ...and close it immediately
        idlefd = open("/dev/null", O_RDONLY | O_CLOEXEC);   // re-reserve the spare descriptor
        continue;
    }
    else
        ERR_EXIT("accept4");
}
