Introduction to website architecture:
Many companies now build their web server stack on LNMP or LAMP. Nginx generally performs better than Apache under heavy concurrent load, so many companies are gradually replacing Apache with Nginx. Nginx also supports reverse proxying and high-availability configurations out of the box, which are among its strongest features.
LNMP stands for Linux, Nginx, MySQL, and PHP. I won't go into the installation here. Instead, let's look at a case I ran into.
Case:
One day, a colleague who works on the admin backend reported that backend access was very slow and sometimes returned a 502 error. The report went up to my technical supervisor, who then asked me to handle it (I was drinking tea at the time), so I knew I had work to do again.
Analysis:
I went straight to the colleague and asked whether it could be a network problem. I also asked other colleagues to visit the backend, and they got the same busy response. At this point I knew things were not that simple. The company runs on the LNMP stack and had not done much tuning before, so the next step was to optimize the website's service architecture.
1. Case analysis.
Access is slow and sometimes fails outright, yet everything worked before. A sudden problem like this most likely means that Nginx and PHP are failing to respond in time, possibly because the number of user connections to the sites on our other domains has grown sharply. From there we can find the root cause, fix it, and optimize, relying on experience and some searching on Baidu.
2. Problem solving and process analysis
1. Nginx optimization:
1. Check the Nginx error log
$ cat /usr/local/nginx/logs/error.log | grep error
No errors found; the error log looks normal.
Check the access log of the backend domain
$ cat /var/log/access_nging.log | grep error
(I did not capture a screenshot in time. The log has since been rotated; logs are cut locally and deleted on a schedule.)
The access log did contain error entries, including more than a dozen 502 errors. That confirmed the problem.
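A rough way to count the 502 responses, rather than grepping for the word "error", is to tally the status-code field. This is only a sketch: it assumes the access log uses nginx's default "combined" format, where the status code is the ninth whitespace-separated field, and the function names `count_status` and `show_502` are made up for illustration.

```shell
# Count responses per HTTP status code in an nginx access log.
# Assumption: default "combined" log format, status code in field 9.
count_status() {
    awk '{ counts[$9]++ } END { for (code in counts) print code, counts[code] }' "$1" | sort
}

# Print only the requests that were answered with a 502.
show_502() {
    awk '$9 == 502' "$1"
}
```

For example, `count_status /var/log/access_nging.log` would print one line per status code with its count. With a custom log_format the field number would need adjusting.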
2. Problem analysis and Nginx optimization
1. Nginx open file limit
The first likely cause we thought of was the Nginx open file limit, so we decided to raise it.
In the Nginx configuration file, the open file limit was set to 4096. As expected, it had never been tuned, which may be the cause. We change 4096 to 51200, save, and reload Nginx.
vim /usr/local/nginx/conf/nginx.conf
worker_rlimit_nofile 51200;
events {
    worker_connections 51200;
}
service nginx reload
2. Linux system file limits
Changing the Nginx open file setting alone may have no effect, so we also need to check the system limit on open files.
ulimit -n
The system limit is also 4096. Next, we raise the system limit and make the change permanent.
Open the configuration file
vim /etc/security/limits.conf
Change the parameters:
* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535
Note: The exact system limit is up to you. It just needs to be larger than the open file limit set in Nginx.
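The rule in the note (system limit at least as large as the Nginx limit) can be checked with a small script. This is a sketch under assumptions: `nginx_nofile` and `limit_is_sufficient` are hypothetical helper names, and the parsing only handles a `worker_rlimit_nofile` directive written on a single line.

```shell
# Read worker_rlimit_nofile from an nginx config file (prints nothing if unset).
nginx_nofile() {
    awk '$1 == "worker_rlimit_nofile" { gsub(/;/, "", $2); print $2; exit }' "$1"
}

# Succeed if the system limit ($2) is at least the nginx limit ($1).
limit_is_sufficient() {
    [ "$2" -ge "$1" ]
}
```

A possible use, after logging in again so the new limits apply: `limit_is_sufficient "$(nginx_nofile /usr/local/nginx/conf/nginx.conf)" "$(ulimit -n)" && echo ok`.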
3. Nginx FastCGI timeouts too short
Nginx usually hands PHP requests to PHP through the FastCGI interface, so the FastCGI parameters matter a great deal. Whenever the HTTP server receives a request for a dynamic page, it passes the request to a FastCGI process for execution and then returns the result to the browser. Since most PHP pages are dynamic, FastCGI configuration is an essential part of the optimization.
Open the nginx.conf configuration file
vim /usr/local/nginx/conf/nginx.conf
Change FastCGI's connect, send, and read timeout parameters to 300. The configuration is as follows:
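As a minimal sketch of that change, assuming the three parameters meant are nginx's standard `fastcgi_connect_timeout`, `fastcgi_send_timeout`, and `fastcgi_read_timeout` directives, the helper below writes them to a file. The helper name `write_fastcgi_timeouts` and the idea of using a separate include file are illustrative assumptions; the directives could equally be typed straight into the http, server, or location block.

```shell
# Write the three FastCGI timeout directives to the file given as $1.
# The value 300 (seconds) comes from the text; the include-file approach is an assumption.
write_fastcgi_timeouts() {
    cat > "$1" <<'EOF'
fastcgi_connect_timeout 300;
fastcgi_send_timeout 300;
fastcgi_read_timeout 300;
EOF
}
```

Note that the nginx documentation says the connect timeout usually cannot exceed 75 seconds, so in practice the send and read values are likely the ones that matter here.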
Reload nginx
service nginx reload
4. Test access to the domain
I visited the domain again and the page loaded. After several more visits, access was stable but still a bit slow. At this point the remaining problem points to PHP, so the next step is PHP optimization.
2. PHP optimization:
1. Check the PHP log
As with Nginx, we check the log first.
tail -n 100 /usr/local/php/var/log/php-fpm.log
The PHP log contains a warning:
WARNING : [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers)
The warning tells us that the PHP-FPM pool is busy and suggests increasing pm.start_servers or pm.min/max_spare_servers.
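2. Cause analysis
The log shows only this warning, which means the situation is not yet serious. Let's analyze why the warning appears.
First we need to be clear about what pm.start_servers and pm.min/max_spare_servers actually do in PHP, and why this warning shows up. Here are my previous configuration parameters.
Next, let's look at what these parameters do: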
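Parameter analysis:
· pm = dynamic: PHP-FPM runs in dynamic mode. Note: PHP-FPM has dynamic and static working modes (a third mode, ondemand, also exists); the stock configuration uses dynamic.
· pm.max_children: the number of child processes in static mode, and the upper limit on child processes in dynamic mode
· pm.start_servers: the number of child processes created at startup in dynamic mode; it must be no less than pm.min_spare_servers and no greater than pm.max_spare_servers
· pm.min_spare_servers: the minimum number of idle processes in dynamic mode
· pm.max_spare_servers: the maximum number of idle processes in dynamic mode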
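The relationship between these parameters can be written down as a quick sanity check before editing the pool file. This is a sketch, not PHP-FPM's own validation: `pm_settings_ok` is a hypothetical helper, and the example values in the usage note are illustrative rather than the author's settings.

```shell
# Check the ordering expected in dynamic mode:
#   min_spare_servers <= start_servers <= max_spare_servers <= max_children
# Arguments: min_spare start_servers max_spare max_children
pm_settings_ok() {
    [ "$1" -le "$2" ] &&
    [ "$2" -le "$3" ] &&
    [ "$3" -le "$4" ]
}
```

For instance, `pm_settings_ok 5 10 35 50` succeeds, while `pm_settings_ok 20 10 35 50` fails because start_servers is below min_spare_servers.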
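Working modes:
Static mode
In static mode only pm.max_children takes effect. It is the number of processes PHP-FPM keeps running at all times.
Dynamic mode
In dynamic mode the relevant parameters are pm.start_servers, pm.min_spare_servers, and pm.max_spare_servers: the number of processes started initially, the minimum number of idle processes, and the maximum number of idle processes.
Mode comparison:
Static mode suits servers with more memory, 8 GB and above, because on such servers a fixed pool of processes is more efficient.
Dynamic mode suits machines with little memory: it allocates processes flexibly and saves memory. PHP-FPM adds and removes processes automatically, although creating and reclaiming processes dynamically also costs server resources.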
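3. PHP parameter tuning
How do we tune the parameters to reach our goal? In general, a php-fpm process uses only about 3 MB of memory at first, but after running for a while it grows to 20-40 MB, because PHP programs leak some memory after they finish executing.
So in principle the maximum number of PHP processes is roughly total memory / 40 MB. The system itself also needs memory, though, so we cannot take the raw result of the division as the maximum process count, or the server will run out of memory.
My server has 8 GB of memory, so the theoretical maximum is about 200 PHP processes. Based on that, I adjusted the settings:
Use static mode with the maximum number of processes set between 125 and 150. Done.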
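The estimate above is plain integer arithmetic, sketched below. The function name `estimate_max_children` is made up, and the amount of memory reserved for the system (2048 MB in the usage note) is an assumption for illustration, not a figure from the author.

```shell
# Estimate an upper bound for the number of php-fpm processes.
# Arguments: total memory (MB), memory reserved for the system (MB),
#            memory per php-fpm process (MB). Integer division rounds down.
estimate_max_children() {
    echo $(( ($1 - $2) / $3 ))
}
```

With 8 GB and nothing reserved, `estimate_max_children 8192 0 40` gives 204, the "about 200" mentioned above. Reserving 2048 MB for the system, `estimate_max_children 8192 2048 40` gives 153, which is close to the 125-150 range chosen.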
Reload PHP
service php-fpm reload
Check the number of processes:
netstat -anpo | grep php-fpm | wc -l
`128`
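The netstat command counts socket lines that belong to php-fpm rather than processes. As an alternative sketch, the process list can be counted by name. `count_named` is a hypothetical helper that reads one command name per line on standard input, and the sketch assumes the processes are named exactly php-fpm.

```shell
# Count lines on stdin whose first field equals the given process name.
# Prints 0 when nothing matches.
count_named() {
    awk -v name="$1" '$1 == name { n++ } END { print n + 0 }'
}
```

A possible use is `ps -eo comm= | count_named php-fpm`. The result includes the php-fpm master process, so it should come out one higher than pm.max_children in static mode.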
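The target was reached.
4. Result
Visiting the site again, PHP pages load much faster, the log shows no more warnings, and backend access is back to normal.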
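3. Stress testing
How well a website performs and how much load it can take can be measured with stress testing. Here I briefly introduce the ab tool, which is used as follows:
ab -n 1000000 -c 10000 (URL of a PHP file)
-n is the total number of requests in the stress test
-c is the number of concurrent users to simulate
ab is the stress testing tool that ships with Apache, and it is easy to use. As long as it is installed locally, you can test your server's performance remotely. Personally I find it good enough. Below is the result of simulating one thousand users making 100,000 requests. When you run your own tests, raise the parameters gradually to find your site's bottleneck.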
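Raising the load gradually can be scripted. The sketch below only prints the ab commands for a series of concurrency levels so they can be reviewed before anything is run; `ab_steps` is a hypothetical helper, and the URL and numbers in the usage note are placeholders.

```shell
# Print one ab command per concurrency level (dry run; nothing is executed).
# Arguments: URL, total requests per step, then one or more concurrency levels.
ab_steps() {
    url=$1
    total=$2
    shift 2
    for c in "$@"; do
        echo "ab -n $total -c $c $url"
    done
}
```

For example, `ab_steps http://example.com/test.php 100000 100 500 1000` prints three commands with increasing -c values. Piping the output to `sh` would run them one after another, assuming ab is installed and each -c value does not exceed -n.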