A socket is a Linux inter-process communication (IPC) mechanism. It supports communication not only between processes on the same host, but also between processes on different hosts.
The operating environment of this tutorial: Linux 5.9.8, Dell G3 computer.
The word "socket" originally refers to an electrical socket. In computer communication, a socket is a convention, or a method, for communication between computers. Through this convention, a computer can receive data from other computers and send data to them.
Socket in Linux
A socket is one of the Linux inter-process communication methods (IPC, Inter-Process Communication; for details, see: Summary of Linux inter-process communication methods). Compared with other IPC methods, its advantage is that it works both between processes on the same host and between processes on different hosts. According to the communication domain, sockets are divided into two types: Unix domain sockets and Internet domain sockets.
1. Internet domain socket
An Internet domain socket is used for communication between processes on different hosts. In most cases, when we say "socket" we mean an Internet domain socket. (Unless otherwise specified below, "socket" refers to an Internet domain socket.)
To communicate between processes on different hosts, the first problem to solve is how to uniquely identify a process. Each process on a host has a unique pid, and the pid is enough to identify the peers when both processes run on the same host. If the two processes are on different hosts, however, the same pid may appear on both, so the pid is not usable in this scenario. Is there another way? A host can be uniquely identified by its IP address, and a program on that host can be located by its port. For inter-process communication we also need to know which protocol is used. The combination "IP + port + protocol" therefore uniquely identifies a process on a host in the network. These are also the main parameters for creating a socket.
Once each process has a unique identifier, the next step is communication. Communication takes two sides: there is a sender program and a receiver program, and a socket can be regarded as an endpoint of the connection between them. The sender writes a piece of information into its socket, the sending socket delivers it to the receiving socket, and finally the information reaches the receiver. How the information travels from the sending socket to the receiving socket is the concern of the operating system and the network stack; we do not need to know the details. As shown in the figure below:
To maintain the connection between both ends, it is not enough for a socket to hold its own unique identifier. It also needs the identifier of the other side, so the sending and receiving sockets described above are really only half of the picture. A complete socket is a 5-tuple of [protocol, local address, local port, remote address, remote port]. For example, if the socket of the sending end is [tcp, sender IP, sender port, receiver IP, receiver port], then the socket of the receiving end is [tcp, receiver IP, receiver port, sender IP, sender port].
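The mirrored 5-tuple can be observed directly. The following is a minimal sketch, assuming a loopback TCP connection inside a single process; the helper name endpoints is made up for this illustration.

```go
package main

import (
	"fmt"
	"net"
)

// endpoints opens one TCP connection over loopback and returns the local and
// remote address of each end, as reported by the operating system.
func endpoints() (clientLocal, clientRemote, serverLocal, serverRemote string, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: let the kernel choose
	if err != nil {
		return
	}
	defer ln.Close()

	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return
	}
	defer client.Close()

	server, err := ln.Accept()
	if err != nil {
		return
	}
	defer server.Close()

	return client.LocalAddr().String(), client.RemoteAddr().String(),
		server.LocalAddr().String(), server.RemoteAddr().String(), nil
}

func main() {
	cl, cr, sl, sr, err := endpoints()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Each line is [protocol, local address:port, remote address:port].
	fmt.Printf("client socket: [tcp, %s, %s]\n", cl, cr)
	fmt.Printf("server socket: [tcp, %s, %s]\n", sl, sr)
}
```

The two printed lines should contain the same two address:port pairs in swapped positions, which is exactly the relationship between the sending and receiving sockets described above.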
Let's use an analogy to deepen our understanding. Suppose I contact you by sending a WeChat message. The two of us are the processes, the WeChat client is the socket, and the WeChat ID is our unique identifier. How Tencent delivers my message to your WeChat is a detail we do not need to care about. To maintain the connection between us, having a WeChat client alone is not enough; we also have to add each other as friends, so that we can find each other through the friend list. Your entry in the friend list of my WeChat client is my complete socket, and my entry in the friend list of your WeChat client is your complete socket. Hopefully that did not confuse you.
Depending on the communication protocol, sockets fall into three types: stream socket (SOCK_STREAM), datagram socket (SOCK_DGRAM) and raw socket.
Stream socket (SOCK_STREAM): the most common socket. It uses the TCP protocol and provides a reliable, connection-oriented communication stream, guaranteeing that data arrives correctly and in order. It is used for Telnet remote connections, WWW services, and so on.
Datagram socket (SOCK_DGRAM): uses the UDP protocol and provides a connectionless service. Data is transmitted as independent messages, which may arrive out of order, and delivery is not guaranteed. Applications that use UDP must implement their own acknowledgment scheme.
Raw socket: allows direct access to lower-layer protocols such as IP or ICMP. It is mainly used for developing and testing new network protocol implementations and can perform relatively low-level operations. It is powerful, but less convenient to use than the two sockets introduced above, and ordinary programs do not deal with raw sockets.
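To make the contrast with stream sockets concrete, here is a small sketch of a datagram exchange, assuming two UDP sockets on the loopback interface in one process. The function name udpRoundTrip and the 2-second deadline are choices made for this illustration only.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// udpRoundTrip sends one datagram to a local UDP socket and returns what the
// receiver read. No connection is established: every datagram is addressed
// individually and stands on its own.
func udpRoundTrip(msg string) (string, error) {
	receiver, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer receiver.Close()

	sender, err := net.ListenPacket("udp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer sender.Close()

	// No connect/accept step: the destination is given per message.
	if _, err := sender.WriteTo([]byte(msg), receiver.LocalAddr()); err != nil {
		return "", err
	}

	// UDP does not guarantee delivery, so never wait forever for a datagram.
	receiver.SetReadDeadline(time.Now().Add(2 * time.Second))
	buf := make([]byte, 1500)
	n, _, err := receiver.ReadFrom(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	got, err := udpRoundTrip("hello over udp")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

On loopback the datagram will normally arrive, but the read deadline reflects the point made above: with UDP the application itself has to cope with messages that never show up.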
The working process of a socket is shown in the figure below (taking the stream socket as an example; the datagram socket process is different, see: What is a socket (Socket)). The server starts first. It creates a socket by calling socket(), calls bind() to associate the socket with a local network address, calls listen() to put the socket into the listening state and to specify the length of its request queue, and then calls accept() to receive connections. After creating its own socket, the client calls connect() to establish a connection with the server. Once the connection is established, the client and server exchange data by calling read() and write(). Finally, when the data transfer is complete, both sides call close() to close their sockets.
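The call sequence above can be followed step by step with the raw system-call wrappers in Go's syscall package. This is only a sketch, assuming a Unix-like system and IPv4 loopback, with server and client both living in one process for brevity; real programs would normally use the higher-level net package.

```go
package main

import (
	"fmt"
	"syscall"
)

// streamLifecycle walks through the stream-socket calls on the loopback
// interface and returns the bytes the server side read.
func streamLifecycle(msg string) (string, error) {
	loopback := [4]byte{127, 0, 0, 1}

	// Server: socket() -> bind() -> listen()
	lfd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(lfd)
	if err := syscall.Bind(lfd, &syscall.SockaddrInet4{Port: 0, Addr: loopback}); err != nil {
		return "", err
	}
	if err := syscall.Listen(lfd, 16); err != nil { // 16 = request queue length
		return "", err
	}
	sa, err := syscall.Getsockname(lfd) // find out which port the kernel picked
	if err != nil {
		return "", err
	}
	port := sa.(*syscall.SockaddrInet4).Port

	// Client: socket() -> connect()
	cfd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_STREAM, 0)
	if err != nil {
		return "", err
	}
	defer syscall.Close(cfd)
	if err := syscall.Connect(cfd, &syscall.SockaddrInet4{Port: port, Addr: loopback}); err != nil {
		return "", err
	}

	// Server: accept() returns a new descriptor for this one connection.
	afd, _, err := syscall.Accept(lfd)
	if err != nil {
		return "", err
	}
	defer syscall.Close(afd)

	// Data transfer: write() on one end, read() on the other.
	if _, err := syscall.Write(cfd, []byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, 4096)
	n, err := syscall.Read(afd, buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil // the deferred calls play the role of close()
}

func main() {
	got, err := streamLifecycle("hello over tcp")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Note that connect() can return before accept() is called: the kernel completes the handshake and parks the connection in the request queue whose length was passed to listen().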
From the perspective of the TCP connection, the process above can be summarized as shown in the figure. TCP's three-way handshake corresponds to establishing the socket connection. Once the connection is established, the two sides exchange data through read and write, and finally the four-way handshake tears down the connection and the socket is deleted.
2. Unix domain socket
A Unix domain socket, also called an IPC (inter-process communication) socket, is used for communication between processes on the same host. Sockets were originally designed for network communication; later an IPC mechanism was built on the socket framework, and that is the UNIX domain socket. Network sockets can also be used between processes on the same host (through the loopback address 127.0.0.1), but UNIX domain sockets are more efficient for IPC: the data does not pass through the network protocol stack, so there is no packing and unpacking, no checksum calculation, and no maintenance of sequence numbers and acknowledgments. Application-layer data is simply copied from one process to another. This is possible because IPC on a single host is inherently reliable, whereas network protocols are designed for unreliable links.
UNIX domain sockets are full-duplex and offer rich API semantics, which gives them clear advantages over other IPC mechanisms. They have become one of the most widely used IPC mechanisms; for example, X Window servers and GUI programs communicate through a UNIX domain socket. The Unix domain socket is part of the POSIX standard, so do not be misled by the name: Linux supports it as well.
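As a brief illustration, here is a sketch of two endpoints talking over a Unix domain socket, assuming a temporary directory for the socket file. The file name demo.sock and the function name unixEcho are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// unixEcho exchanges one message over a Unix domain socket created at a
// temporary path and returns the reply the client received.
func unixEcho(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "uds-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sockPath := filepath.Join(dir, "demo.sock") // hypothetical socket file name

	// The address is a file system path rather than an IP address and a port.
	ln, err := net.Listen("unix", sockPath)
	if err != nil {
		return "", err
	}
	defer ln.Close()

	// Server goroutine: accept one connection and echo back what it reads.
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		buf := make([]byte, 4096)
		n, err := conn.Read(buf)
		if err != nil {
			return
		}
		conn.Write(buf[:n])
	}()

	client, err := net.Dial("unix", sockPath)
	if err != nil {
		return "", err
	}
	defer client.Close()
	if _, err := client.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, 4096)
	n, err := client.Read(buf)
	if err != nil {
		return "", err
	}
	return string(buf[:n]), nil
}

func main() {
	got, err := unixEcho("hello over a unix socket")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Apart from the "unix" network name and the path used as the address, the code has the same shape as a TCP exchange, which reflects how the IPC mechanism was built on the existing socket framework.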
Readers familiar with Docker will know that the Docker daemon listens on a docker.sock file, whose default path is /var/run/docker.sock. This socket is a Unix domain socket. It is covered in detail in the practice section later on.
Socket Practice
The best way to learn programming is to practice. Next, let's actually communicate over sockets and observe the socket files.
1. Internet domain socket practice
Now we will use a socket to write a server. Because I have little experience with C, I chose Go for this exercise. The server is very simple: it listens on port 1208; when it receives ping it returns pong, when it receives echo xxx it returns xxx, and when it receives quit it closes the connection. The code of socket-server.go is based on the article: Using Go for Socket Programming | Start with Luochen. It is as follows:
    package main

    import (
    	"fmt"
    	"net"
    	"strings"
    )

    // connHandler serves one client connection until the client sends quit
    // or disconnects.
    func connHandler(c net.Conn) {
    	if c == nil {
    		return
    	}
    	defer func() {
    		fmt.Printf("Connection from %v closed.\n", c.RemoteAddr())
    		c.Close()
    	}()

    	buf := make([]byte, 4096)
    	for {
    		cnt, err := c.Read(buf)
    		if err != nil || cnt == 0 {
    			return
    		}
    		inStr := strings.TrimSpace(string(buf[:cnt]))
    		inputs := strings.Split(inStr, " ")
    		switch inputs[0] {
    		case "ping":
    			c.Write([]byte("pong\n"))
    		case "echo":
    			echoStr := strings.Join(inputs[1:], " ") + "\n"
    			c.Write([]byte(echoStr))
    		case "quit":
    			// return, not break: break would only leave the switch.
    			return
    		default:
    			fmt.Printf("Unsupported command: %s\n", inputs[0])
    		}
    	}
    }

    func main() {
    	server, err := net.Listen("tcp", ":1208")
    	if err != nil {
    		fmt.Printf("Fail to start server, %s\n", err)
    		return // without a listener there is nothing to accept on
    	}
    	fmt.Println("Server Started ...")
    	for {
    		conn, err := server.Accept()
    		if err != nil {
    			fmt.Printf("Fail to connect, %s\n", err)
    			break
    		}
    		// One goroutine per connection.
    		go connHandler(conn)
    	}
    }
In Unix-like systems, where everything is a file, a socket created by a process is represented by a socket file, and the process transfers messages by reading from and writing to that file. On Linux, the socket files of a process can be found under the /proc/pid/fd/ path. Let's start our socket-server and take a look at the corresponding socket file. Start the server first:
    # go run socket-server.go
    Server Started ...
Then open another window. We first look up the pid of the server process, using either the lsof or the netstat command:
    # lsof -i :1208
    COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    socket-se 20007 root    3u  IPv6 470314      0t0  TCP *:1208 (LISTEN)
    # netstat -tupan | grep 1208
    tcp6       0      0 :::1208       :::*       LISTEN      20007/socket-server
You can see that the pid of our server is 20007. Next, let's check the socket the server is listening on:
    # ls -l /proc/20007/fd
    total 0
    lrwx------ 1 root root 64 Sep 11 07:15 0 -> /dev/pts/0
    lrwx------ 1 root root 64 Sep 11 07:15 1 -> /dev/pts/0
    lrwx------ 1 root root 64 Sep 11 07:15 2 -> /dev/pts/0
    lrwx------ 1 root root 64 Sep 11 07:15 3 -> 'socket:[470314]'
    lrwx------ 1 root root 64 Sep 11 07:15 4 -> 'anon_inode:[eventpoll]'
You can see that /proc/20007/fd/3 is a symbolic link pointing to socket:[470314], which is the socket on the server side. Starting socket-server went through three steps: socket() --> bind() --> listen(). The resulting LISTEN socket listens for connection requests to port 1208.
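The same link can be read from inside a program. The sketch below assumes Linux (it relies on /proc/self/fd) and uses a throwaway listener rather than the socket-server above; the function name listenerFDLink is made up for this illustration.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// listenerFDLink opens a TCP listener and returns the target of the
// /proc/self/fd entry that corresponds to its socket (Linux only).
func listenerFDLink() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	// File returns a duplicate of the listener's descriptor. The duplicate
	// refers to the same socket, so its link shows the same inode number.
	f, err := ln.(*net.TCPListener).File()
	if err != nil {
		return "", err
	}
	defer f.Close()

	return os.Readlink(fmt.Sprintf("/proc/self/fd/%d", f.Fd()))
}

func main() {
	link, err := listenerFDLink()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(link) // has the form socket:[inode]
}
```

The number in brackets is the socket's inode, the same value lsof prints in its DEVICE column for the socket-server process above.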
We know that socket communication requires a pair of sockets: server side and client side. Now let's open another window and use telnet to start a client on the same machine as the socket-server. Let's take a look at the socket on the client side:
    # telnet localhost 1208
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
Now check again which file descriptors are open on the server port:
    # lsof -i :1208
    COMMAND     PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    socket-se 20007   root    3u  IPv6 470314      0t0  TCP *:1208 (LISTEN)
    socket-se 20007   root    5u  IPv6 473748      0t0  TCP localhost:1208->localhost:51090 (ESTABLISHED)
    telnet    20375 ubuntu    3u  IPv4 473747      0t0  TCP localhost:51090->localhost:1208 (ESTABLISHED)
Compared with the previous output, there are 2 more entries. The 3 entries are:
*:1208 (LISTEN) is the server's listening socket. It belongs to the process with pid 20007.
localhost:1208->localhost:51090 (ESTABLISHED) is a new socket that the server created for the client. It handles communication with the client and belongs to the process with pid 20007.
localhost:51090->localhost:1208 (ESTABLISHED) is a new socket that the client created for the server. It handles communication with the server and belongs to the process with pid 20375.
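Under the /proc/pid/fd/ path you can see the new sockets created by the server and the client; we will not go into the details here. From the three entries we can see that the first two sockets, the LISTEN socket and the newly created ESTABLISHED socket, both belong to the server process. For every connection, the server process creates a new socket to talk to the client. The source IP and source port of this socket are the server's IP and port, and the destination IP and port are the client's. Correspondingly, the client also creates a new socket whose source and destination are exactly the reverse of those of the server's socket. The client's port is a high-numbered port randomly assigned by the host.
From these results we can answer a question: "Does a new port get created after the server calls socket.accept?" The answer is no. The server's listening port does not change, and the port of the new socket that the server creates for the client is the same one; in this example both are 1208. Doesn't that cause a port conflict? Of course not. A socket is uniquely determined by the 5-tuple [protocol, local IP, local port, remote IP, remote port], so the socket *:1208 (LISTEN) and the socket localhost:1208->localhost:51090 (ESTABLISHED) are different sockets. What is the LISTEN socket for, then? My understanding is that when a connection-request packet arrives, such as a TCP SYN, it is received by the LISTEN socket and handled through accept. If the packet is data from a client whose connection is already established, the data is placed in the receive buffer of the matching ESTABLISHED socket. When the server needs to read data from a particular client, it uses that ESTABLISHED socket to fetch the data from the buffer via recv or read, which ensures that the response goes back to the correct client.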
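The claim that accepted sockets keep the listening port is easy to check. The following sketch assumes two clients connecting over loopback within one process; the function name acceptTwo is invented for this example.

```go
package main

import (
	"fmt"
	"net"
)

// acceptTwo connects two clients to one listener and returns the listener's
// port, plus the local and remote port of each accepted server-side socket.
func acceptTwo() (listenPort int, local, remote [2]int, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return
	}
	defer ln.Close()
	listenPort = ln.Addr().(*net.TCPAddr).Port

	for i := 0; i < 2; i++ {
		var client, server net.Conn
		if client, err = net.Dial("tcp", ln.Addr().String()); err != nil {
			return
		}
		defer client.Close() // keep both connections open until we return
		if server, err = ln.Accept(); err != nil {
			return
		}
		defer server.Close()
		local[i] = server.LocalAddr().(*net.TCPAddr).Port
		remote[i] = server.RemoteAddr().(*net.TCPAddr).Port
	}
	return
}

func main() {
	port, local, remote, err := acceptTwo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("listening port:      ", port)
	fmt.Println("accepted local ports:", local)  // both equal the listening port
	fmt.Println("client ports:        ", remote) // two different ephemeral ports
}
```

Both accepted sockets share the server port; only the remote port differs, and that difference is enough to make the two 5-tuples distinct.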
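As mentioned above, the client host assigns a random port to the process that initiates a connection in order to create its socket, while the server process creates a new socket for each connection. For the client, the limit is therefore the number of ports: port numbers range from 0 to 65535, and the first 1024 (0 to 1023) are not available to ordinary user programs, so at most 64512 concurrent connections are possible. Strictly speaking, this ceiling applies to connections from one client IP to one server IP and port, because every connection needs a distinct 5-tuple. For the server, the total number of concurrent connections is limited by the number of file handles a process may open. A socket is also a kind of file: each socket has a file descriptor (FD), and every socket a process creates opens one file handle. This limit can be viewed with ulimit -n, and raising it raises the server's ceiling on concurrent connections. The ulimit of the server machine in this example is:
    # ulimit -n
    1024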
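A process can also query this limit itself. Below is a minimal sketch, assuming a Unix-like system and Go's syscall package; the function name openFileLimit is chosen for this illustration. Note that recent Go versions may raise the soft limit at program startup, so the soft value printed here is not guaranteed to match what ulimit -n shows in the shell.

```go
package main

import (
	"fmt"
	"syscall"
)

// openFileLimit returns the soft and hard limit on open file descriptors
// for the current process, the same resource that ulimit -n reports.
func openFileLimit() (soft, hard uint64, err error) {
	var rl syscall.Rlimit
	if err = syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, 0, err
	}
	return uint64(rl.Cur), uint64(rl.Max), nil
}

func main() {
	soft, hard, err := openFileLimit()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Every socket consumes one descriptor, so the soft limit bounds the
	// number of connections this process can hold open at once.
	fmt.Printf("soft limit: %d, hard limit: %d\n", soft, hard)
}
```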
So far we have talked at length about how the server and client sockets are created. Now let's look at the communication between them. Remember that our server responds to 3 commands: ping, echo and quit. Let's try them:
    # telnet localhost 1208
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    ping
    pong
    echo Hello,socket
    Hello,socket
    quit
    Connection closed by foreign host.
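We can see the client and the server communicating through their sockets.
At this point, let's summarize the whole process, from telnet initiating the connection, to the client sending ping and the server answering pong, to the client sending quit and the connection being closed:
1. telnet initiates a connection request to localhost:1208; the client host assigns the client process a random port, 51090, for this connection.
2. The server receives the request packet through the socket TCP *:1208 (LISTEN) and handles it through accept.
3. Once the handshake completes, the client side holds the socket TCP localhost:51090->localhost:1208 for its connection to the server.
4. The server process creates a new socket, TCP localhost:1208->localhost:51090, for its connection to the client.
5. The client sends ping; the ping packet is sent into the socket TCP localhost:51090->localhost:1208.
6. The server receives the ping packet through the socket TCP localhost:1208->localhost:51090 and returns pong; the pong packet travels back to the client along the same path, completing one exchange.
7. The client process sends quit, which reaches the server through the same socket path. The server then closes the connection, deletes the socket TCP localhost:1208->localhost:51090 and releases the file handle; the client deletes the socket TCP localhost:51090->localhost:1208 and releases port 51090.
In the process above, data travelling from socket to socket also passes through the operating system, the network stack and so on; we will not describe that in detail here.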
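The session summarized above can also be driven from code instead of telnet. The sketch below is self-contained under one assumption: it bundles a cut-down, line-based stand-in for socket-server.go (the serve function, written for this example) so that it runs without the real server on port 1208.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// serve is a cut-down stand-in for socket-server.go: it answers ping and
// echo, and closes the connection on quit.
func serve(c net.Conn) {
	defer c.Close()
	sc := bufio.NewScanner(c)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		switch fields[0] {
		case "ping":
			fmt.Fprintln(c, "pong")
		case "echo":
			fmt.Fprintln(c, strings.Join(fields[1:], " "))
		case "quit":
			return
		}
	}
}

// session plays the telnet session: ping, echo, quit. It returns the two
// replies and whether the server closed the connection after quit.
func session() (pong, echo string, closed bool, err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return
	}
	defer ln.Close()
	go func() {
		if c, err := ln.Accept(); err == nil {
			serve(c)
		}
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return
	}
	defer conn.Close()
	r := bufio.NewReader(conn)

	fmt.Fprintln(conn, "ping")
	if pong, err = r.ReadString('\n'); err != nil {
		return
	}
	fmt.Fprintln(conn, "echo Hello,socket")
	if echo, err = r.ReadString('\n'); err != nil {
		return
	}
	fmt.Fprintln(conn, "quit")
	_, rerr := r.ReadString('\n') // fails with io.EOF once the server has closed
	closed = rerr != nil
	return strings.TrimSpace(pong), strings.TrimSpace(echo), closed, nil
}

func main() {
	pong, echo, closed, err := session()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(pong, "|", echo, "| closed by server:", closed)
}
```

To talk to the real socket-server instead, the client half would dial localhost:1208 and the serve function and listener could be dropped.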
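2. Unix domain socket practice
Docker uses a client-server architecture: the user enters commands through the docker client, and the client passes them on to the docker daemon for execution. The docker daemon listens on a Unix domain socket to communicate with other processes; the default path is /var/run/docker.sock. Let's look at this file:
    # ls -l /var/run/docker.sock
    srw-rw---- 1 root docker 0 Aug 31 01:19 /var/run/docker.sock
You can see that its Linux file type is "s", which means socket. Through this socket we can call the docker daemon's API directly. Next, we use docker.sock to call the API and run an nginx container, which is equivalent to executing the following on the docker client:
    # docker run nginx
Unlike on the docker client, where a single command does the job, running a container through the API takes 2 steps: creating the container and starting it.
1. Create the nginx container. We call the docker API with curl, specifying the Unix domain socket with --unix-socket /var/run/docker.sock. First call /containers/create and pass a parameter that sets the image to nginx:
    # curl -XPOST --unix-socket /var/run/docker.sock -d '{"Image":"nginx"}' -H 'Content-Type: application/json' http://localhost/containers/create
    {"Id":"67bfc390d58f7ba9ac808d3fc948a5d4e29395e94288a7588ec3523af6806e1a","Warnings":[]}
2. Start the container. Using the container id returned by the previous step, we start this nginx:
    # curl -XPOST --unix-socket /var/run/docker.sock http://localhost/containers/67bfc390d58f7ba9ac808d3fc948a5d4e29395e94288a7588ec3523af6806e1a/start
    # docker container ls
    CONTAINER ID   IMAGE   COMMAND                  CREATED              STATUS         PORTS    NAMES
    67bfc390d58f   nginx   "/docker-entrypoint.…"   About a minute ago   Up 7 seconds   80/tcp   romantic_heisenberg
At this point, through a Unix domain socket, we have achieved communication between the client process curl and the server process docker daemon, and successfully called the docker API to run an nginx container.
It is worth noting that when connecting to the server's Unix domain socket, we specify the server's socket file directly, whereas with an Internet domain socket we specify the server's IP address and port number.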
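The difference in addressing also shows up when speaking HTTP over a Unix domain socket from a program. The following sketch assumes a stand-in HTTP server on a temporary socket file, not the Docker daemon; the names unixHTTPClient, demo, demo.sock and the /hello path are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net"
	"net/http"
	"os"
	"path/filepath"
)

// unixHTTPClient returns an HTTP client whose connections all go to the Unix
// domain socket at socketPath; the host part of the URL is not used to dial.
func unixHTTPClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

// demo starts a stand-in HTTP server on a temporary Unix socket (NOT the
// Docker daemon) and sends it a POST request through unixHTTPClient.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "uds-http")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	sock := filepath.Join(dir, "demo.sock") // hypothetical socket path

	ln, err := net.Listen("unix", sock)
	if err != nil {
		return "", err
	}
	srv := &http.Server{Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%s %s", r.Method, r.URL.Path) // report what was requested
	})}
	go srv.Serve(ln)
	defer srv.Close()

	resp, err := unixHTTPClient(sock).Post("http://localhost/hello", "application/json", nil)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Pointing unixHTTPClient at /var/run/docker.sock and posting to http://localhost/containers/create would correspond to the curl call above, provided the process has permission to access the socket file.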
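Summary
A socket is one of the Linux inter-process communication methods. It supports communication not only between processes on the same host, but also between processes on different hosts. According to the communication domain, sockets are divided into 2 types: Unix domain sockets and Internet domain sockets.
According to the communication protocol, Internet domain sockets are divided into 3 types: stream socket (SOCK_STREAM), datagram socket (SOCK_DGRAM) and raw socket.
A complete socket is a 5-tuple of [protocol, local address, local port, remote address, remote port].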