Later, through tracing, I discovered that this kind of situation is closely related to PHP's file_get_contents() function.
On large and medium-sized websites, API calls over HTTP are commonplace. PHP programmers like to use the simple and convenient file_get_contents("http://example.com/") to fetch the content returned by a URL. However, if http://example.com/ responds slowly, file_get_contents() stays stuck there and does not time out.
We know that php.ini has a max_execution_time parameter that sets the maximum execution time of a PHP script. However, under php-cgi (php-fpm) this parameter does not take effect. What really controls the maximum execution time of a PHP script is the request_terminate_timeout parameter in the php-fpm.conf configuration file, which is documented there as follows:
The timeout (in seconds) for serving a single request after which the worker process will be terminated
Should be used when 'max_execution_time' ini option does not stop script execution for some reason
'0s' means 'off'
<value name="request_terminate_timeout">0s</value>
The default value is 0 seconds, which means the PHP script will keep executing indefinitely. As a result, when all php-cgi processes are stuck in file_get_contents(), this Nginx+PHP web server can no longer handle new PHP requests, and Nginx returns "502 Bad Gateway" to users. Changing this parameter is necessary to set a maximum execution time for PHP scripts, but it treats the symptom rather than the root cause. For example, with a value of 30s, if file_get_contents() is slow to fetch the page, 150 php-cgi processes can handle only 5 requests per second (150 processes / 30 seconds per request), and the web server will still struggle to avoid "502 Bad Gateway".
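The arithmetic behind the "5 requests per second" figure can be sketched as follows. This is only a rough upper bound under the assumption that every worker stays blocked for the full duration on every request; the function name max_requests_per_second is made up for illustration.

```php
<?php
/**
 * Upper bound on requests per second when each worker process is
 * blocked for $secondsPerRequest on every request it serves.
 * (Hypothetical helper, not part of PHP or php-fpm.)
 */
function max_requests_per_second(int $workers, float $secondsPerRequest): float
{
    if ($secondsPerRequest <= 0) {
        throw new InvalidArgumentException('secondsPerRequest must be positive');
    }
    return $workers / $secondsPerRequest;
}

// 150 php-cgi workers, each killed only after 30 seconds.
echo max_requests_per_second(150, 30.0), "\n";

// The same pool when each slow HTTP call gives up after 1 second.
echo max_requests_per_second(150, 1.0), "\n";
```

Under these assumptions the bound rises from 5 to 150 requests per second, which is why shortening the time each worker can stay blocked matters more than merely terminating workers after 30 seconds.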
For a complete solution, PHP programmers have to break the habit of calling file_get_contents("http://example.com/") directly and instead make a small change: add a timeout and use the following method to perform the HTTP GET request. If that seems like too much trouble, you can wrap the following code in a function of your own.
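<?php
// Build a stream context that carries a timeout for the HTTP wrapper.
$ctx = stream_context_create(array(
    'http' => array(
        'timeout' => 1 // Set a timeout, in seconds
    )
));
// The second argument (use_include_path) is a boolean, so pass false.
file_get_contents("http://example.com/", false, $ctx);
?>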
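One possible way to wrap this in a function is sketched below. The names build_timeout_context and http_get_with_timeout are hypothetical, chosen only for illustration; the sketch relies on nothing beyond stream_context_create() and file_get_contents(), and it does not try to detect a response that was cut off partway through.

```php
<?php
// Hypothetical helper names; only stream_context_create() and
// file_get_contents() come from the article itself.

/** Build a stream context whose HTTP wrapper uses $timeoutSeconds as its timeout. */
function build_timeout_context(float $timeoutSeconds)
{
    return stream_context_create(array(
        'http' => array(
            'timeout' => $timeoutSeconds, // timeout, in seconds
        ),
    ));
}

/**
 * HTTP GET with a timeout. Returns the response body, or false when
 * the request could not be completed, as file_get_contents() does.
 */
function http_get_with_timeout(string $url, float $timeoutSeconds = 1.0)
{
    $ctx = build_timeout_context($timeoutSeconds);
    // @ hides the warning raised on failure; callers must check for false.
    return @file_get_contents($url, false, $ctx);
}
```

A caller would then write something like `$body = http_get_with_timeout('http://example.com/', 1.0);` and handle `$body === false` explicitly, for example by logging the failure and returning a fallback response, so that the php-cgi process is freed quickly instead of waiting on the remote site.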
Of course, this is not the only thing that can drive a php-cgi process to 100% CPU. So how can we determine whether file_get_contents() is the cause?
First, use the top command to check the php-cgi process with high CPU usage.
top - 10:34:18 up 724 days, 21:01,  3 users,  load average: 17.86, 11.16, 7.69
Tasks: 561 total,  15 running, 546 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.9%us,  4.2%sy,  0.0%ni, 89.4%id,  0.2%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   8100996k total,  4320108k used,  3780888k free,   772572k buffers
Swap:  8193108k total,    50776k used,  8142332k free,   412088k cached
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
10747 www       18   0  360m  22m  12m R 100.6  0.3   0:02.60 php-cgi
10709 www       16   0  359m  28m  17m R  96.8  0.4   0:11.34 php-cgi
10745 www       18   0  360m  24m  14m R  94.8  0.3   0:39.51 php-cgi
10707 www       18   0  360m  25m  14m S  77.4  0.3   0:33.48 php-cgi
10782 www       20   0  360m  26m  15m R  75.5  0.3   0:10.93 php-cgi
10708 www       25   0  360m  22m  12m R  69.7  0.3   0:45.16 php-cgi
10683 www       25   0  362m  28m  15m R  54.2  0.4   0:32.65 php-cgi
10711 www       25   0  360m  25m  15m R  52.2  0.3   0:44.25 php-cgi
10688 www       25   0  359m  25m  15m R  38.7  0.3   0:10.44 php-cgi
10719 www       25   0  360m  26m  16m R   7.7  0.3   0:40.59 php-cgi
Find the PID of one of the php-cgi processes using 100% CPU and trace it with the following command:
strace -p 10747
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
select(7, [6], [6], [], {15, 0}) = 1 (out [6], left {15, 0})
poll([{fd=6, events=POLLIN}], 1, 0) = 0 (Timeout)
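If the trace shows output like this, you can be sure that file_get_contents() is causing the problem. The process keeps calling select() and poll() on the same socket (fd 6), and every poll() returns immediately with nothing to read, so the process spins while it waits for the remote server to respond.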
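strace requires shell access to the server. As a complementary check from inside the application, slow calls can also be timed and logged. The sketch below is an assumption-laden illustration rather than part of the original diagnosis: the function name timed_call and the 1-second default threshold are made up, and it uses only microtime() and error_log().

```php
<?php
/**
 * Run $fn, measure its wall-clock duration, and log a line when the
 * duration exceeds $thresholdSeconds. (Hypothetical helper.)
 * Returns array($result, $elapsedSeconds).
 */
function timed_call(callable $fn, string $label, float $thresholdSeconds = 1.0): array
{
    $start = microtime(true);
    $result = $fn();
    $elapsed = microtime(true) - $start;

    if ($elapsed > $thresholdSeconds) {
        // Goes to the PHP error log configured for this process.
        error_log(sprintf('slow call %s took %.3f s', $label, $elapsed));
    }
    return array($result, $elapsed);
}
```

Wrapping a remote fetch in it, for example `timed_call(function () { return file_get_contents("http://example.com/"); }, 'example.com')`, would leave a log entry for each call that runs longer than the threshold, which helps identify which remote URL is holding php-cgi processes.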