
Introduction to the concept of file descriptors and FILE

PHP中文网 (Original)
2017-06-21 13:44:23

1. File descriptor (key point)

In Linux, everything can be regarded as a file. Files are divided into ordinary files, directory files, link files, and device files. A file descriptor is an index that the kernel creates to manage opened files efficiently. It is a non-negative integer (usually a small one) that refers to an opened file, and all system calls that perform I/O go through file descriptors. When a program starts, 0 is standard input, 1 is standard output, and 2 is standard error. If you open a new file at this point, its file descriptor will be 3.

1.1 Concept introduction

System calls such as open() and creat() return a file descriptor, and calls such as read() and close() take one as an argument. A file descriptor, usually written fd, is an integer of type int. In essence it is a subscript into the file descriptor table, so it acts as an index: the process uses fd to look up, in the file descriptor table of its PCB, the file pointer (filp) stored in that entry. Each process keeps a file descriptor table in its PCB (Process Control Block). The file descriptor is the index into this table, and each entry in the table holds a pointer to an opened file. An opened file is represented in the kernel by a file structure, and the pointer in the file descriptor table points to that file structure. Each time a file is opened, fd is allocated by default from the smallest unused index. Disadvantages of file descriptors: they are not portable to systems other than UNIX, and they are not intuitive.

The figure below shows the relationship between them:

Each file structure mainly contains the following information:

1.2 Chart Explanation

The file structure maintains the File Status Flag (the f_flags member of the file structure) and the current read/write position (the f_pos member of the file structure). In the figure above, process 1 and process 2 both open the same file but correspond to different file structures, so they can have different File Status Flags and read/write positions. Another important member of the file structure is f_count, the reference count. System calls such as dup and fork cause multiple file descriptors to point to the same file structure. For example, if fd1 and fd2 both refer to the same file structure, its reference count is 2. On close(fd1), the file structure is not released; only the reference count drops to 1. On close(fd2), the reference count drops to 0, the file structure is released, and only then is the file actually closed.

Each file structure points to a file_operations structure. The members of this structure are function pointers that point to the kernel functions implementing the various file operations. For example, when a user program calls read on a file descriptor, read enters the kernel through a system call, finds the file structure that the file descriptor points to, finds the file_operations structure that the file structure points to, and calls the kernel function that its read member points to in order to complete the request. Calls to lseek, read, write, ioctl, open, and similar functions in a user program are all ultimately completed by the kernel calling the function that the corresponding member of file_operations points to. The release member of file_operations handles the user program's close request. It is called release rather than close because a close request does not necessarily close the file; it decrements the reference count, and the file is closed only when the count reaches 0. For regular files opened on the same file system, operations such as read and write follow the same steps and call the same functions, so the file structures of the three open files in the figure point to the same file_operations structure. If you open a character device file, its read and write operations necessarily differ from those of regular files: they read and write a hardware device rather than disk data blocks. Its file structure therefore points to a different file_operations structure, in which the file operation functions are implemented by the device's driver.

Each file structure has a pointer to a dentry structure. "dentry" is short for directory entry. The argument we pass to open, stat, and similar functions is a path, such as /home/akaedu/a, and the inode of the file must be found from that path. To reduce the number of disk reads, the kernel caches the directory tree structure in what is called the dentry cache, in which each node is a dentry structure. The lookup simply follows the dentries along each component of the path: starting from the root directory /, it finds the home directory, then the akaedu directory, then the file a. The dentry cache holds only recently accessed directory entries. If the directory entry being looked up is not in the cache, it must be read from disk into memory.

Each dentry structure has a pointer to an inode structure. The inode structure stores information read from the on-disk inode. In the example above there are two dentries, representing /home/akaedu/a and /home/akaedu/b. Both point to the same inode, which means the two files are hard links to each other. The inode structure stores information read from the inode on the disk partition, such as the owner, file size, file type, and permission bits. Each inode structure has a pointer to an inode_operations structure, which is also a set of function pointers to kernel functions that carry out file and directory operations. Unlike file_operations, inode_operations does not point to functions that operate on one particular file, but to functions that affect the layout of files and directories, such as adding and deleting files and directories or following symbolic links. All inode structures that belong to the same file system can point to the same inode_operations structure.

The inode structure has a pointer to a super_block structure. The super_block structure stores information read from the super block of the disk partition, such as the file system type and block size. The s_root member of the super_block structure is a pointer to a dentry that indicates where the root directory of this file system is mounted. In the example above, this partition is mounted on the /home directory.

The file, dentry, inode, and super_block structures are the core concepts of the VFS (Virtual File System, or Virtual Filesystem).

1.3 Operations on file descriptors

(1). View the Linux file descriptor limits

    [root@localhost ~]# sysctl -a | grep -i file-max --color
    fs.file-max = 392036
    [root@localhost ~]# cat /proc/sys/fs/file-max
    392036
    [root@localhost ~]# ulimit -n
    1024
    [root@localhost ~]#

Linux limits the maximum number of file descriptors at two levels: the user level and the system level.

System-level limit: the sysctl command and the proc file system show the same value. This is the system-level limit, the maximum number of open files for all users on the system combined.

User-level limit: the ulimit command shows the user-level limit. It is applied to each process started from the user's login session, so no such process can hold more open file descriptors than this value.

(2). Modify the file descriptor limit

    [root@localhost ~]# ulimit -SHn 10240
    [root@localhost ~]# ulimit -n
    10240
    [root@localhost ~]#

The above modification affects only the current session and is temporary. To make the change permanent, edit /etc/security/limits.conf as follows:

    [root@localhost ~]# grep -vE '^$|^#' /etc/security/limits.conf
    *                hard nofile                  4096
    [root@localhost ~]#

    // The default configuration file contains only the hard entry. soft is the value currently in effect; hard is the maximum value that can be set.
    [root@localhost ~]# grep -vE '^$|^#' /etc/security/limits.conf
    *      hard         nofile       10240
    *      soft         nofile       10240
    [root@localhost ~]#
    // soft <= hard: the soft limit cannot be higher than the hard limit

(3). Modify system restrictions

    [root@localhost ~]# sysctl -w fs.file-max=400000
    fs.file-max = 400000
    [root@localhost ~]# echo 350000 > /proc/sys/fs/file-max  # lost after reboot
    [root@localhost ~]# cat /proc/sys/fs/file-max
    350000
    [root@localhost ~]#

// The above changes to the system-level limit are temporary.
// To make the change permanent, add fs.file-max=400000 to /etc/sysctl.conf and run sysctl -p to apply it.

1.4 View the file descriptor with a program

The following program opens the file /home/shenlan/fd.c, creating it if it does not already exist in that directory, and prints the file descriptor that open() returns. The descriptor is 3: when the process starts, three files are already open, namely standard input (0), standard output (1), and standard error (2), and by default fd is allocated from the smallest unused index.

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>   /* close() */

    int main(void)
    {
        int fd;

        /* Create the file if needed, open it write-only, and truncate it.
           Mode 0644 gives rw-r--r-- permissions. */
        if ((fd = open("/home/shenlan/fd.c", O_CREAT | O_WRONLY | O_TRUNC, 0644)) < 0) {
            perror("openfile fd.c error");   /* perror appends the reason and a newline */
            exit(1);
        } else {
            printf("openfile fd.c success:%d\n", fd);
        }
        if (close(fd) < 0) {
            perror("closefile fd.c error");
            exit(1);
        } else
            printf("closefile fd.c success!\n");
        exit(0);
    }

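Execution result: the program reports descriptor 3 when the open succeeds, then reports that the file was closed successfully.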

1.5 How a process opens a file, step by step

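A process opens a file through the open() system call. In essence this obtains a file descriptor, which the process then uses as its link to the file for all further operations. When a process opens a file, a file object is created for that file and stored in the process's open file table (the file descriptor array), which determines the file descriptor of the opened file. In the kernel, open() is implemented by sys_open(). sys_open() creates the dentry, inode, and file objects for the file, looks for a free entry in the process's open file table fd_array[NR_OPEN_DEFAULT] in the files_struct structure, and returns the subscript (index) of that entry, which is the file descriptor. When the file object is created, its f_op is pointed at the file_operations function set of the file system the file belongs to, and that function set in turn comes from the inode of the specific file. This is how the virtual file system is connected to the operations of the actual file system.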

2. The FILE structure in the C standard library and file descriptors

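The C language uses file pointers rather than file descriptors as its I/O handles. A "file pointer" points to a data structure in the process's user space called the FILE structure. The FILE structure contains a buffer and a file descriptor value, and the file descriptor value is an index into the file descriptor table. In a sense, a file pointer is a handle to a handle. Stream functions (such as fopen) return a pointer to a FILE structure, and the FILE structure contains the file descriptor. The functions that work on FILE can be seen as wrappers around the system calls that operate directly on fd, with the advantage of I/O buffering.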

The function that converts a file descriptor fd into a file stream FILE* is
FILE* fdopen(int filedes, const char* mode);

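In early C standard libraries, FILE was defined in stdio.h. In Turbo C (see Tan Haoqiang's "C Programming"), the FILE structure contains a member fd, which is the file descriptor. You can also find the struct _IO_FILE structure in /usr/include/stdio.h on an installed Ubuntu system. This structure is fairly complex, and we care only about the part we need, the file descriptor. However, the structure has no member with an obvious name such as fd. Instead, the member _fileno, of type int, draws our attention, although we cannot yet be sure that it is the file descriptor. Writing a program to test it is the best approach, and the following code can be used: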

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>   /* write() */

    int main(void)
    {
        char buf[50] = {"ILOVE this game!"};
        FILE *myfile;

        myfile = fopen("2.txt", "w+");
        if (!myfile) {
            printf("error:openfile failed!\n");
            exit(1);   /* do not dereference a NULL stream below */
        }
        /* _fileno is a member of glibc's struct _IO_FILE */
        printf("The openedfile's descriptor is %d\n", myfile->_fileno);
        /* write only the string itself, not the unused rest of buf */
        if (write(myfile->_fileno, buf, strlen(buf)) < 0) {
            perror("error:writefile failed");
            exit(1);
        } else {
            printf("writefile successed!\n");
        }
        exit(0);
    }

The program uses the fopen function to open the file 2.txt for reading and writing, creating it if it does not exist, and fopen returns the FILE pointer myfile. It uses printf to print the value of myfile->_fileno to the terminal, then passes myfile->_fileno as a file descriptor to the write system call to write the buffer data to the open file. The contents of 2.txt can then be viewed with the cat command. The value of _fileno is 3, because standard input, output, and error occupy 0, 1, and 2.

Therefore, the _fileno member is the file descriptor (or, on Windows, the handle) that the operating system returns when the file is opened. For further study, you can read "C Standard Library" published by People's Posts and Telecommunications Publishing House, or the /glibc-2.9/manual/io.txti file. In Linux, a file descriptor is allocated by checking the descriptors one by one from small to large and taking the first one that is not in use. You can also write a program to test this.

The file descriptor table, also called the file descriptor array, stores all the files opened by a process. The file descriptor array is contained in the process's open file table, the files_struct structure, which is defined in /include/linux/fdtable.h. It is an array of pointers to the file type, fd_array[NR_OPEN_DEFAULT]. NR_OPEN_DEFAULT is also defined in fdtable.h and depends on the specific CPU architecture: #define NR_OPEN_DEFAULT BITS_PER_LONG.

The relationship among the FILE structure, the file descriptor, and the file structure can be represented by the following figure:
