
Introduction to file types under Linux

青灯夜游
2019-02-26 16:16:26

This article introduces the file types found on Linux, with a closer look at pipes, sockets, character devices, and block devices.

Linux has seven file types. The character in parentheses is the type flag shown in the first column of ls -l output:

  • Ordinary file (-)

  • Directory (d)

  • Symbolic link, also called soft link (l)

  • Socket file (s)

  • Character device (c)

  • Block device (b)

  • Pipe file, that is, named pipe (p)

Ordinary files, directories, and soft links require no further explanation. Let's take a look at pipe files, socket files, character devices, and block device types.
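
As a rough illustration of the seven types, the sketch below maps a path to the type character that ls -l would print. It uses Python's standard os and stat modules; the helper name file_type_char and the sample paths are chosen here for illustration only.

```python
import os
import stat

# Each entry pairs a stat-module predicate with the ls-style type character.
_TYPE_CHECKS = [
    (stat.S_ISREG, "-"),   # ordinary file
    (stat.S_ISDIR, "d"),   # directory
    (stat.S_ISLNK, "l"),   # symbolic link
    (stat.S_ISSOCK, "s"),  # socket file
    (stat.S_ISCHR, "c"),   # character device
    (stat.S_ISBLK, "b"),   # block device
    (stat.S_ISFIFO, "p"),  # named pipe
]


def file_type_char(path):
    """Return the ls-style type character for path (symlinks are not followed)."""
    mode = os.lstat(path).st_mode
    for check, char in _TYPE_CHECKS:
        if check(mode):
            return char
    return "?"


if __name__ == "__main__":
    # Sample paths are assumptions; any that do not exist are skipped.
    for p in ("/", "/dev/null"):
        if os.path.lexists(p):
            print(p, file_type_char(p))
```

os.lstat is used instead of os.stat so that a symbolic link is reported as a link rather than as the file it points to.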

Pipe file

Pipes are divided into anonymous pipes and named pipes. A pipe is written at one end and read at the other, so data flows in one direction only, and the data passes directly through memory. Pipes are a form of inter-process communication, for example a parent process writing and a child process reading.

In the shell, an anonymous pipe is created with the pipe symbol "|", as in ls | grep xxx. Here the shell starts ls and grep as two child processes in the same process group and connects them with a pipe: ls writes to the pipe and grep reads from it.

In programming languages, an anonymous pipe is implemented as a pair of file handles or file descriptors (say A and B). One is the write end (A): data written to A is automatically pushed to B. The other is the read end (B), from which the data is read.
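
A minimal sketch of the two descriptors, assuming Python's os.pipe() as the programming-language interface. The function name pipe_demo is made up for this example, and the message is assumed to be small enough to fit in the kernel's pipe buffer, since a single process both writes and reads here.

```python
import os


def pipe_demo(message: bytes) -> bytes:
    """Write message into a pipe and read it back from the other end."""
    # os.pipe() returns (read end, write end): B and A in the text above.
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, message)
    finally:
        # Closing the write end lets the reader see end-of-file.
        os.close(write_fd)

    chunks = []
    try:
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:  # empty result means end-of-file
                break
            chunks.append(chunk)
    finally:
        os.close(read_fd)
    return b"".join(chunks)


if __name__ == "__main__":
    print(pipe_demo(b"hello through a pipe"))
```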

A named pipe keeps a file in the file system. It is also called a FIFO, meaning first in, first out. Although the named pipe file lives in the file system, it is only an entry point for using the pipe. Data sent through a named pipe still travels through memory and is never stored on the file system. Named pipes are less efficient.

In the shell, you can create a named pipe with the mknod command or the mkfifo command. Named pipes are very useful in shell scripts with certain special needs. Bash has supported coprocesses (the coproc keyword) since Bash 4, and ksh and zsh have supported them for much longer, but what a coprocess does can also be achieved with named pipes.
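
The same idea can be sketched outside the shell. The example below assumes Python's os.mkfifo() in place of the mkfifo command; the function name fifo_roundtrip and the file name demo.fifo are illustrative. A writer thread is used because opening a FIFO for reading blocks until some writer opens it as well.

```python
import os
import stat
import tempfile
import threading


def fifo_roundtrip(message: bytes) -> bytes:
    """Send message through a named pipe created in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fifo")  # hypothetical file name
        os.mkfifo(path)
        # The file system entry exists and is reported as a FIFO.
        assert stat.S_ISFIFO(os.stat(path).st_mode)

        def writer():
            # Blocks until the reader below opens the FIFO.
            with open(path, "wb") as w:
                w.write(message)

        t = threading.Thread(target=writer)
        t.start()
        with open(path, "rb") as r:
            data = r.read()  # returns once the writer closes its end
        t.join()
        return data


if __name__ == "__main__":
    print(fifo_roundtrip(b"hello through a FIFO"))
```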

An ordinary pipe is one-way and cannot provide two-way communication: one end can only write and the other end can only read. To communicate in both directions, you can create two pipes (giving 4 file handles: two read ends and two write ends) or use a socket, which is more convenient.
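
The two-pipe approach might look like the following sketch, which assumes a Unix-like system with os.fork(). The parent sends a message down one pipe and the child answers on the other; upper-casing the message is an arbitrary stand-in for real work, and the function name is invented for this example.

```python
import os


def _read_all(fd):
    """Read from fd until end-of-file."""
    data = b""
    while True:
        chunk = os.read(fd, 4096)
        if not chunk:
            return data
        data += chunk


def upper_via_child(message: bytes) -> bytes:
    """Send message to a child process and return its upper-cased reply."""
    p2c_r, p2c_w = os.pipe()  # pipe 1: parent -> child
    c2p_r, c2p_w = os.pipe()  # pipe 2: child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: keep the read end of pipe 1 and the write end of pipe 2.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            os.write(c2p_w, _read_all(p2c_r).upper())
        finally:
            os._exit(0)
    # Parent: keep the write end of pipe 1 and the read end of pipe 2.
    os.close(p2c_r)
    os.close(c2p_w)
    os.write(p2c_w, message)
    os.close(p2c_w)  # signals end-of-file to the child
    reply = _read_all(c2p_r)
    os.close(c2p_r)
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(upper_via_child(b"hello"))
```

Each process closes the ends it does not use. Otherwise a reader would never see end-of-file, because a write end would still be open somewhere.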

Socket

A socket provides communication between two endpoints. As noted above, it can serve as a bidirectional pipe for inter-process communication. Sockets can also carry inter-process communication across hosts over the network.

Sockets are only meaningful in pairs, that is, a connection has two ends. Each end has a file descriptor (or file handle) that can be both read and written, so a pair of sockets is equivalent to two one-way pipes running in opposite directions.

Sockets fall into two broad categories by protocol family: network sockets (AF_INET for IPv4 and AF_INET6 for IPv6) and Unix domain sockets (AF_UNIX). Below the protocol family, sockets can be subdivided further. For example, network sockets can be TCP sockets, UDP sockets, raw sockets, and so on. Link-layer sockets also exist, although on Linux they belong to a separate family, AF_PACKET. Network sockets are the foundation and core of network programming.
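
To make the family and type distinction concrete, here is a small sketch using Python's socket module. It creates three unconnected sockets and reports the family and type of each. The labels tcp4, udp4, and unix_stream are names chosen for this example, and IPv6 is left out so the sketch also runs on hosts without IPv6 support.

```python
import socket


def describe(sock):
    """Return the (family, type) names of a socket object."""
    return sock.family.name, sock.type.name


def make_examples():
    """Create one socket per (family, type) pair and describe each."""
    specs = {
        "tcp4": (socket.AF_INET, socket.SOCK_STREAM),
        "udp4": (socket.AF_INET, socket.SOCK_DGRAM),
        "unix_stream": (socket.AF_UNIX, socket.SOCK_STREAM),
    }
    result = {}
    for label, (family, kind) in specs.items():
        # The sockets are never bound or connected, only inspected.
        with socket.socket(family, kind) as s:
            result[label] = describe(s)
    return result


if __name__ == "__main__":
    for label, info in make_examples().items():
        print(label, info)
```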

Unix Domain Socket

For inter-process communication on a single host, a Unix domain socket is a better choice than an inet socket: it has no network communication component, so it leaves out many network functions and is more lightweight. In fact, the pipes offered by some languages on certain operating system platforms are implemented with Unix domain sockets, which gives an idea of how efficient they are.

A Unix domain socket pair has two file handles (say A and B), and both are readable and writable at the same time. When process 1 writes data to A, it is automatically pushed to B, and process 2 can read it from B. Likewise, when process 2 writes data to B, it is automatically pushed to A, and process 1 can read it from A. As follows:

Process 1         Process 2
---------------------------
A   ----------->  B
A   <-----------  B

Programming languages provide functions for creating a Unix domain socket pair (see man socketpair). In the bash shell, you can create one with the nc (netcat) command, or simply use two named pipes to get the same effect. If necessary, you can learn how to use Unix domain sockets in the bash shell.
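
A minimal sketch of such a pair, assuming Python's socket.socketpair(), which on Unix-like systems creates a connected pair of AF_UNIX sockets by default. Both ends live in one process here to keep the example short; in practice A and B would belong to two different processes.

```python
import socket


def socketpair_demo():
    """Exchange one short message in each direction over a socket pair."""
    a, b = socket.socketpair()
    with a, b:
        a.sendall(b"from A")    # written to A ...
        got_b = b.recv(1024)    # ... and read from B
        b.sendall(b"from B")    # written to B ...
        got_a = a.recv(1024)    # ... and read from A
    return got_b, got_a


if __name__ == "__main__":
    print(socketpair_demo())
```

Unlike a pipe, each of the two handles is used for both sending and receiving.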

Network Sockets

Inter-process communication across a network requires network sockets. Each network socket connection is described by 5 parts, called the socket's 5-tuple. The format is as follows:

{protocol, src_addr, src_port, dest_addr, dest_port}
That is, protocol, source address, source port, destination address, and destination port.
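
The 5-tuple can be read back from a connected socket. The sketch below assumes a TCP connection over the loopback address 127.0.0.1. FiveTuple, five_tuple, and loopback_demo are names invented for this example, and the protocol label is derived from the socket type as a simplification.

```python
import socket
from collections import namedtuple

FiveTuple = namedtuple(
    "FiveTuple", "protocol src_addr src_port dest_addr dest_port"
)


def five_tuple(sock):
    """Build the 5-tuple of a connected IPv4 socket, seen from its own end."""
    src_addr, src_port = sock.getsockname()[:2]
    dest_addr, dest_port = sock.getpeername()[:2]
    protocol = "tcp" if sock.type == socket.SOCK_STREAM else "udp"
    return FiveTuple(protocol, src_addr, src_port, dest_addr, dest_port)


def loopback_demo():
    """Connect a client to a local server and return both ends' 5-tuples."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: the kernel picks a free port
        server.listen(1)
        with socket.create_connection(server.getsockname()) as client:
            conn, _ = server.accept()
            with conn:
                return five_tuple(client), five_tuple(conn)


if __name__ == "__main__":
    client_view, server_view = loopback_demo()
    print(client_view)
    print(server_view)
```

The two views describe the same connection with source and destination swapped.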

Each end of a socket has two buffers in kernel space, a recv buffer and a send buffer, so a pair of sockets has 4 buffers. Process 1 writes data to the send buffer of its own socket. The data is then sent to the peer's recv buffer, where process 2 can read it, and vice versa.
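
The size of these two buffers can be queried with the standard SO_RCVBUF and SO_SNDBUF socket options. This is a small sketch using Python's socket module; the function names are illustrative and the reported sizes depend on the system's configuration.

```python
import socket


def buffer_sizes(sock):
    """Return the kernel recv and send buffer sizes of a socket, in bytes."""
    return {
        "recv": sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
        "send": sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
    }


def tcp_buffer_sizes():
    """Query the buffer sizes of a fresh, unconnected TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return buffer_sizes(s)


if __name__ == "__main__":
    print(tcp_buffer_sizes())
```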

Before a network socket can actually be read and written, it needs some setup. After the server creates its socket with the socket() function (which returns a file handle or file descriptor for reading and writing), it must bind an address and port with bind() and start listening with listen(). The client only needs to create a socket and call connect() to send a connection request to the server socket.
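
The call sequence can be sketched as a tiny echo exchange on the loopback address. Python's socket methods mirror the socket(), bind(), listen(), accept(), and connect() functions named above. The helper names, the single-connection server, and the 1024-byte read size are simplifications made for this example.

```python
import socket
import threading


def echo_once(server):
    """Accept one connection and send back whatever arrives first."""
    conn, _addr = server.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def run_echo_demo(message: bytes) -> bytes:
    """Run a one-shot echo server and client in the same process."""
    # Server side: socket() -> bind() -> listen()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: the kernel picks a free port
        server.listen(1)
        host, port = server.getsockname()

        t = threading.Thread(target=echo_once, args=(server,))
        t.start()

        # Client side: socket() -> connect()
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.connect((host, port))
            client.sendall(message)
            reply = client.recv(1024)
        t.join()
        return reply


if __name__ == "__main__":
    print(run_echo_demo(b"hello server"))
```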

For TCP sockets, a connection request from the client starts the three-way handshake with the server, which is completed by the kernel and does not involve the user-space process. The three steps are as follows. First, the client sends a SYN. When the server receives it, the kernel puts the connection into the syn queue, sets its state to syn-recv, and replies to the client with SYN+ACK. When the client's ACK arrives, the kernel moves the connection from the syn queue to the established queue (also called the accept queue) and marks its state as established. Finally, the waiting user-space process calls accept(), and the kernel removes the connection from the accept queue. A connection returned by accept() is established, and the processes at both ends can then exchange data through it.
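
One way to observe that the kernel completes the handshake on its own is to connect before the server process ever calls accept(). The sketch below assumes Linux behavior on the loopback interface, where connect() returns once the connection sits in the accept queue. The function name, the backlog value of 5, and the 5-second timeouts are arbitrary choices for this example.

```python
import socket


def connect_before_accept():
    """Connect and send data before accept() is called on the server."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))
        server.listen(5)  # from here on the kernel answers SYNs itself

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            client.settimeout(5)
            # Returns after the handshake; nobody has called accept() yet.
            client.connect(server.getsockname())
            client.sendall(b"early data")

            # Only now does user space take the connection off the queue.
            conn, peer = server.accept()
            with conn:
                conn.settimeout(5)
                same_peer = peer == client.getsockname()
                return same_peer, conn.recv(1024)


if __name__ == "__main__":
    print(connect_before_accept())
```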

For more about the principles of TCP sockets, see my other article: The must-know socket and TCP connection process.

Block devices and character devices

Block devices are hardware devices that are distinguished by random (not necessarily sequential) access to fixed-size chunks of data. A fixed-size chunk is called a block. The most common block device is the hard disk, but many other block devices also exist, such as floppy drives, Blu-ray readers, and flash memory. Note that these are devices on which file systems are mounted, and file systems are like a lingua franca for block devices.

Character devices are accessed as a continuous stream of data, one byte after another. Typical character devices are terminals (of which there are many types, both physical and virtual) and keyboards.

The easiest way to tell block devices from character devices is to look at how data is accessed: block devices support random access, whereas character devices must be accessed in byte order.

If you can read a little data here and a little data there and finally stitch it into one continuous piece of data, it is a block device. Data on a hard disk, for example, is not contiguous, and obtaining one piece of data may require random access. In a moderately large file on disk, the first 10k of data may sit in contiguous blocks or sectors, while the next 10k may be far away, even on a different cylinder.

If the bytes are processed in exactly the order in which they were accessed, that is, the byte order stays the same from access through final processing, it is a character device. In other words, character devices can be thought of as stream devices. Take keyboard input: if two keys are pressed one after another, the bytes for the first key must arrive before the bytes for the second. A terminal works the same way: if a program outputs the letter a and then the number 3, the terminal must display a first and 3 after it.
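
Whether a device node is a block or a character device can be checked from its file mode. The sketch below uses Python's os and stat modules and also reports the major and minor device numbers. The function name device_info is made up, and the sample paths are assumptions that may not exist on every system.

```python
import os
import stat


def device_info(path):
    """Return (kind, major, minor) for a device node, or None otherwise."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "character"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        return None
    # st_rdev identifies the device that this node refers to.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    # Sample paths are assumptions; missing ones are skipped.
    for p in ("/dev/null", "/dev/tty", "/dev/sda"):
        if os.path.exists(p):
            print(p, device_info(p))
```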


Statement:
This article is reproduced from cnblogs.com. If there is any infringement, please contact admin@php.cn to have it removed.