
How to solve Ulimit faults

WBOY
2023-05-16 15:39:09

I ran into a very interesting problem recently. A group of HAProxy machines kept having trouble. I logged in to the servers and checked CPU, memory, network, and IO, and eventually found that each machine had more than 60,000 connections in the TIME_WAIT state.


The TIME_WAIT state usually piles up on proxy machines such as HAProxy and Nginx, because the proxy frequently closes connections actively. Tuning the kernel's reuse and recycle parameters usually resolves the problem fairly quickly.
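A minimal sketch of the parameters usually involved (note: tcp_tw_recycle was removed in Linux 4.12 because it misbehaves behind NAT, so on modern kernels only the settings below remain; the values are illustrative):

# Allow reusing sockets in TIME_WAIT for new outbound connections
sysctl -w net.ipv4.tcp_tw_reuse=1
# Shorten how long orphaned sockets linger in FIN_WAIT_2
sysctl -w net.ipv4.tcp_fin_timeout=30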

The breakdown of TCP connection states can be obtained with the following command.

netstat -ant | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 70
FIN_WAIT2 30
CLOSING 33
TIME_WAIT 65520
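On newer systems where netstat is no longer installed, ss from iproute2 gives an equivalent view (a sketch):

# Summary counters for all socket states
ss -s
# Count sockets in TIME_WAIT (the first line of output is a header)
ss -tan state time-wait | wc -l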


There is nothing magical in the numbers themselves, except that the TIME_WAIT count is suspiciously close to 65535. It looks like some upper limit has been hit.

What puzzled us even more: why does the service become unavailable when the TIME_WAIT connections reach only about 65535?

Are the claims of millions of connections per machine just bragging? Why can't it take even this much load?

65535 is 2 to the 16th power minus one, a magic number. Let's set it aside for a moment and first figure out how many connections Linux can actually support.

1. How many connections can Linux support?

The answer: practically unlimited. But there are only 65535 ports.

Why are there only 65535 ports?

The TCP and UDP headers each reserve 16 bits for the source port number and 16 bits for the destination port number. That is a historical decision: the port is an unsigned 16-bit value, so the largest possible port number is 2^16 - 1 = 65535.

Standards cemented by history like this are very hard to change.

So, back to the question: how many connections can Linux support? The answer is still: practically unlimited.

Take Nginx as an example, listening on port 80. Machine A connecting to it can open up to roughly 60,000 long-lived connections; machine B can open roughly 60,000 more. That is because a connection is identified by the full tuple of source address, source port, destination address, and destination port, not by the server port alone.
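A rough way to see this on the server itself (a sketch using ss; the awk field positions assume plain IPv4 output):

# Established connections whose local port is 80, grouped by remote IP.
# Every remote (ip, port) pair is a distinct connection even though the
# local side is always the same address and port 80.
ss -tn 'sport = :80' | awk 'NR>1 {split($5, peer, ":"); print peer[1]}' | sort | uniq -c | sort -rn | head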

The idea that Linux can accept only 65535 connections is, frankly, a rather superficial assumption.

If you are the client running a stress test, 65535 local ports may indeed feel tight. For a server, it is more than enough.

2. How to support millions of connections?

As we saw above, the number of connections itself is not the limit. But Linux has another layer of protection: the number of file handles. What you see in the output of the lsof command are exactly these file handles (file descriptors).
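A minimal sketch of counting them for one process (the PID 1 used here is just a placeholder, and lsof also lists non-socket entries, so treat the numbers as approximate):

# Handles opened by a single process
lsof -p 1 | wc -l
# Or, without lsof, count the entries under /proc/<pid>/fd
ls /proc/1/fd | wc -l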

Let’s take a look at the display of several commands.

ulimit shows the number of file handles a single process is allowed to open.

ulimit -n
65535

file-max shows the total number of file handles the operating system can hand out across all processes.

cat /proc/sys/fs/file-max
766722

file-nr shows the number of handles currently allocated, the number of allocated-but-unused handles, and the maximum. It is useful for monitoring, as the sketch after the command shows.

cat /proc/sys/fs/file-nr
1824  0  766722
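For lightweight monitoring, something like this works (a sketch; watch ships with the procps package):

# Refresh the allocated / unused / maximum counters every second
watch -n 1 cat /proc/sys/fs/file-nr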

To support millions of connections, both the per-process and the operating-system handle limits have to be raised. In other words, the values shown by ulimit and file-max must both exceed one million.

3. How to set it?

A common approach is to raise the per-process handle count with the ulimit command, but I strongly recommend against it, for one simple reason: the setting affects only processes started from the same shell. Open another shell, or reboot the machine, and the change is gone. This is the approach in question:

ulimit -n 1000000

The correct way is to modify the /etc/security/limits.conf file. For example, the following content.

root soft nofile 1000000
root hard nofile 1000000
* soft nofile 1000000
* hard nofile 1000000

As you can see, we can also set the handle limit for a specific user. This comes up often when installing applications such as Elasticsearch (es).

es  -  nofile  65535

Even with this method, the change only takes effect in newly opened shells; it applies neither to the shell where you made the edit nor to shells opened before it. xjjdog has run into several cases where problems persisted even though the limits had supposedly been raised.
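A quick way to confirm the new value is visible (a sketch; open a fresh login session, for example a new ssh connection, and read the limits there):

# Run these in the newly opened session
ulimit -Sn   # soft limit on open files
ulimit -Hn   # hard limit on open files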

To be certain, check the limits file of the running process under /proc. For example, cat /proc/180323/limits shows the effective limits in detail.
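A minimal sketch of that check (180323 is just the example PID from above; substitute the PID of your own process):

# The row we care about is "Max open files"; soft and hard limits appear side by side
grep "Max open files" /proc/180323/limits
# prlimit from util-linux reports the same thing
prlimit --nofile --pid 180323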

You cannot set this value arbitrarily high: its upper bound is determined by nr_open. To raise the ceiling, change fs.nr_open in /etc/sysctl.conf.

cat /proc/sys/fs/nr_open
1048576
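If the per-process limit you want is above the default ceiling of 1048576, raise fs.nr_open first (the value 2000000 below is purely illustrative):

# Persist a higher ceiling, then reload the sysctl settings
echo "fs.nr_open = 2000000" >> /etc/sysctl.conf
sysctl -p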

To change the file-max parameter, it is recommended to add the following line to /etc/sysctl.conf. The value below allows more than 6.5 million handles.

fs.file-max = 6553560
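Apply and verify it like this (a sketch):

# Reload /etc/sysctl.conf and confirm the new system-wide limit
sysctl -p
cat /proc/sys/fs/file-max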

If the system-wide limit is exceeded, the kernel logs an error such as: kernel: VFS: file-max limit 65535 reached.

In conclusion:


Even though Linux listens on only a single port, it can accept a huge number of connections. The ceiling on those connections is imposed by the per-process file handle limit and the operating-system-wide handle limit, that is, by ulimit and file-max.

To make the changes permanent, we write them to configuration files. The per-process handle limit goes in /etc/security/limits.conf, and its upper bound is capped by fs.nr_open; the operating-system handle limit goes in /etc/sysctl.conf. Finally, always check /proc/$pid/limits to confirm that the new limits have actually taken effect in the process.
