Nginx working principle, optimization, and vulnerabilities
1. Nginx modules and working principles
Nginx consists of a core (kernel) and modules. The core is deliberately small and does very little: it consults the configuration file to map each client request to a location block (location is a directive in the Nginx configuration, used for URL matching), and each directive configured in that location then invokes a different module to do the corresponding work.
Nginx modules are structurally divided into core modules, basic modules and third-party modules:
Core modules: HTTP module, EVENT module and MAIL module
Basic modules: HTTP Access module, HTTP FastCGI module, HTTP Proxy module and HTTP Rewrite module
Third-party modules: HTTP Upstream Request Hash module, Notice module and HTTP Access Key module.
Modules developed by users according to their own needs are third-party modules. It is precisely with the support of so many modules that Nginx's functions are so powerful.
Nginx modules are functionally divided into the following three categories.
Handlers (processor modules). These modules process requests directly and perform operations such as outputting content and modifying header information. Generally only one handler module processes a given request.
Filters (filter modules). These modules mainly modify the content output by handler modules before Nginx finally sends it.
Proxies (proxy modules). These are modules such as Nginx's HTTP Upstream; they mainly interact with back-end services such as FastCGI to implement functions such as service proxying and load balancing.
Figure 1-1 shows the normal HTTP request and response process of the Nginx module.
Nginx itself does very little work. When it receives an HTTP request, it simply maps the request to a location block by consulting the configuration file, and each directive configured in that location starts a different module to do the work, so the modules can be regarded as the real workers of Nginx. The directives in a location usually involve one handler module and several filter modules (of course, multiple locations can reuse the same module). The handler module processes the request and generates the response content, while the filter modules process that response content.
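The dispatch described above can be sketched as a toy model in Python. This is only an illustration, not how Nginx is implemented: the handler and filter names and the two locations are hypothetical, and location matching is reduced to longest-prefix matching, which is a simplification of Nginx's real rules.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str], str]
Filter = Callable[[str], str]


def static_handler(uri: str) -> str:
    # Hypothetical handler: generates the response body for a static file.
    return f"static file for {uri}"


def php_handler(uri: str) -> str:
    # Hypothetical handler: stands in for a response fetched from FastCGI.
    return f"fastcgi response for {uri}"


def gzip_filter(body: str) -> str:
    # Hypothetical filter: modifies the body produced by the handler.
    return f"gzip({body})"


def header_filter(body: str) -> str:
    # Hypothetical filter: adds headers in front of the body.
    return f"headers+{body}"


# Each location maps to exactly one handler and a chain of filters.
LOCATIONS: Dict[str, Tuple[Handler, List[Filter]]] = {
    "/": (static_handler, [header_filter]),
    "/app/": (php_handler, [gzip_filter, header_filter]),
}


def handle_request(uri: str) -> str:
    # Simplified matching: pick the longest configured prefix of the URI.
    prefix = max((p for p in LOCATIONS if uri.startswith(p)), key=len)
    handler, filters = LOCATIONS[prefix]
    body = handler(uri)          # the handler generates the content
    for apply_filter in filters:  # the filters post-process it in order
        body = apply_filter(body)
    return body


print(handle_request("/app/index.php"))
print(handle_request("/logo.png"))
```

The point of the sketch is the shape of the pipeline: one handler per request, followed by any number of filters, both selected by the matched location.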
In terms of working mode, Nginx can run in single-worker-process mode or multi-worker-process mode. In single-worker mode there is one worker process in addition to the master process, and that worker is single-threaded. In multi-worker mode the master process starts several worker processes, each of which is likewise single-threaded and event-driven. Nginx defaults to single-worker-process mode.
2. Nginx+FastCGI operating principle
1. What is FastCGI
FastCGI is a scalable, high-speed communication interface between an HTTP server and a dynamic scripting language. Most popular HTTP servers support FastCGI, including Apache, Nginx and lighttpd, and FastCGI is also supported by many scripting languages, including PHP.
FastCGI was developed from CGI. The main disadvantage of the traditional CGI interface is poor performance: every time the HTTP server encounters a dynamic program, the script parser has to be started again to parse it, and the result is then returned to the HTTP server. This is almost unusable under highly concurrent access. The traditional CGI interface also has poor security and is rarely used now.
The FastCGI interface uses a client/server structure, which separates the HTTP server from the script parsing server and starts one or more script parsing daemons on the script parsing server. Every time the HTTP server encounters a dynamic program, it hands the request directly to a FastCGI process for execution and then returns the result to the browser. This lets the HTTP server concentrate on static requests and on relaying the results of the dynamic script server to the client, which greatly improves the performance of the whole application system.
2. Nginx+FastCGI operating principle
Nginx does not support calling or parsing external programs directly. All external programs (including PHP) must be called through the FastCGI interface. On Linux, the FastCGI interface is a socket (either a file socket or an IP socket).
wrapper: To call a CGI program, a FastCGI wrapper is also needed (a wrapper can be understood as a program used to start another program). The wrapper is bound to a fixed socket, such as a port or a file socket. When Nginx sends a CGI request to this socket, the wrapper receives the request through the FastCGI interface and then forks (spawns) a new thread, which calls the interpreter or external program to process the script and read the returned data. The wrapper then passes the returned data back to Nginx through the FastCGI interface over the same fixed socket, and finally Nginx sends the returned data (an HTML page or a picture) to the client. This is the whole operation process of Nginx+FastCGI, as shown in Figure 1-3.
So we first need a wrapper, and this wrapper has to handle the following:
Calling the functions of the FastCGI library to talk to Nginx over the socket (the socket reads and writes are implemented inside the FastCGI library and are not transparent to the wrapper)
Scheduling threads, performing fork and kill
Communicating with the application (PHP)
3. spawn-fcgi and PHP-FPM
In the FastCGI interface approach, one or more daemon processes are started on the script parsing server to parse dynamic scripts. These processes are the FastCGI process manager, also called the FastCGI engine. spawn-fcgi and PHP-FPM are two FastCGI process managers that support PHP.
The HTTP server is thus freed from script execution and can respond faster and handle concurrency better.
Similarities and differences between spawn-fcgi and PHP-FPM:
Compared with spawn-fcgi, PHP-FPM controls CPU and memory better. spawn-fcgi crashes easily and has to be monitored with crontab, whereas PHP-FPM has no such trouble.
The main advantage of FastCGI is that it separates the dynamic language from the HTTP server. Nginx and PHP/PHP-FPM are therefore often deployed on different servers to take pressure off the front-end Nginx server, so that Nginx handles only static requests and the forwarding of dynamic requests, while the PHP/PHP-FPM server handles only the parsing of PHP dynamic requests.
4. Nginx+PHP-FPM
PHP-FPM is a FastCGI process manager that exists as a plug-in for PHP. With old versions of PHP (before PHP 5.3.3), using PHP-FPM means installing it into PHP in the form of a patch, and the PHP version must match the PHP-FPM version; this is mandatory.
PHP-FPM began as a patch to the PHP source code, designed to integrate FastCGI process management into the PHP package. In that form it must be patched into your PHP source code, and it can be used after PHP has been compiled and installed.
Settings in PHP-FPM's default configuration file php-fpm.conf:
start_servers
min_spare_servers
max_spare_servers
Configure Nginx to run php: Edit nginx.conf and add the following statement:
The Nginx+PHP-FPM request flow is as follows:
1) The FastCGI process manager php-fpm initializes itself, starts the main php-fpm process and starts start_servers CGI child processes. The main php-fpm process mainly manages the FastCGI child processes and listens on port 9000. The FastCGI child processes wait for connections from the web server.
2) When a client request reaches the web server Nginx, Nginx uses the location directive to hand all files with the php suffix to 127.0.0.1:9000 for processing.
3) The FastCGI process manager PHP-FPM selects a child process (a CGI interpreter) and connects to it. The web server sends the CGI environment variables and standard input to the FastCGI child process.
4) When the FastCGI child process has finished processing, it returns standard output and error information to the web server over the same connection. When the FastCGI child process closes the connection, the request is complete.
5) The FastCGI child process then waits for and processes the next connection from the FastCGI process manager (running in the web server).
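Step 3 above, in which the web server sends CGI environment variables and standard input to the FastCGI child process, can be made concrete with a small sketch of the byte stream involved. The record layout and type numbers follow the FastCGI 1.0 specification as I understand it; the helper names (record, name_value, build_request) and the example SCRIPT_FILENAME value are assumptions for illustration, and the sketch assumes every payload fits into a single record (at most 65535 bytes).

```python
import struct

FCGI_VERSION_1 = 1
FCGI_BEGIN_REQUEST = 1   # record type: start of a request
FCGI_PARAMS = 4          # record type: CGI environment variables
FCGI_STDIN = 5           # record type: request body
FCGI_RESPONDER = 1       # role: a normal request/response application


def record(rec_type: int, request_id: int, content: bytes) -> bytes:
    # 8-byte header: version, type, request id, content length,
    # padding length (0 here) and one reserved byte.
    header = struct.pack("!BBHHBx", FCGI_VERSION_1, rec_type,
                         request_id, len(content), 0)
    return header + content


def encode_length(n: int) -> bytes:
    # Lengths below 128 use one byte; longer ones use four bytes
    # with the high bit set.
    if n < 128:
        return bytes([n])
    return struct.pack("!I", n | 0x80000000)


def name_value(name: str, value: str) -> bytes:
    n, v = name.encode(), value.encode()
    return encode_length(len(n)) + encode_length(len(v)) + n + v


def build_request(params: dict, stdin: bytes = b"", request_id: int = 1) -> bytes:
    begin_body = struct.pack("!HB5x", FCGI_RESPONDER, 0)  # role, flags, reserved
    params_body = b"".join(name_value(k, v) for k, v in params.items())
    parts = [
        record(FCGI_BEGIN_REQUEST, request_id, begin_body),
        record(FCGI_PARAMS, request_id, params_body),
        record(FCGI_PARAMS, request_id, b""),   # empty record ends the params
    ]
    if stdin:
        parts.append(record(FCGI_STDIN, request_id, stdin))
    parts.append(record(FCGI_STDIN, request_id, b""))  # empty record ends stdin
    return b"".join(parts)


# Hypothetical example value; a real server sends many more variables.
request = build_request({"SCRIPT_FILENAME": "/scripts/index.php"})
print(len(request), request[:8].hex())
```

Writing these bytes to the socket on which the FastCGI process listens is, in essence, what the web server does in step 3; the response comes back on the same connection as stdout and end-request records.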
3. Nginx’s IO model
First of all, the event model supported by nginx is as follows (nginx’s wiki):
Nginx supports several methods of handling connections (I/O multiplexing methods), which can be selected with the use directive. Compared with select and poll, only epoll is an efficient method.
Using epoll requires only three system calls: epoll_create(2), epoll_ctl(2) and epoll_wait(2). epoll(4) is an API introduced in Linux kernel 2.5.44 and is widely used in the 2.6 kernel. The advantages of epoll are:
It supports a process opening a large number of socket descriptors (FDs).
I/O efficiency does not decrease linearly as the number of FDs increases; with a large number of FDs, the efficiency of epoll is far higher than that of select/poll.
It uses mmap to speed up message passing between the kernel and user space. This point concerns the concrete implementation of epoll. Whether it is select, poll or epoll, the kernel has to notify user space of FD events, so avoiding unnecessary memory copies is very important. epoll achieves this by having the kernel and user space mmap the same memory. If you have followed epoll since the 2.5 kernel, as I have, you will not have forgotten the manual mmap step.
Kernel fine-tuning. This is not really an advantage of epoll but of the Linux platform as a whole. You may have doubts about the Linux platform, but you cannot deny the ability it gives you to fine-tune the kernel. For example, the kernel TCP/IP stack uses a memory pool to manage the sk_buff structure, so the size of this memory pool (skb_head_pool) can be adjusted at runtime with echo XXXX>/proc/sys/net/core/hot_list_length. Another example is the second parameter of the listen function (the length of the queue of connections that have completed the TCP three-way handshake), which can also be adjusted according to the memory size of your platform. We even tried the latest NAPI network card driver architecture on a special system where the number of data packets is huge but each packet is very small.
(epoll content, refer to epoll_Interactive Encyclopedia)
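The three epoll calls can be tried from Python, whose select module wraps them on Linux: select.epoll() corresponds to epoll_create, register() to epoll_ctl, and poll() to epoll_wait. The following is a minimal sketch rather than anything Nginx does internally; the helper name wait_readable is an assumption, and the code runs only on Linux.

```python
import select
import socket


def wait_readable(sock: socket.socket, timeout: float = 1.0) -> list:
    """Return the file descriptors that became readable within `timeout`."""
    ep = select.epoll()                              # epoll_create
    try:
        ep.register(sock.fileno(), select.EPOLLIN)   # epoll_ctl (add)
        events = ep.poll(timeout)                    # epoll_wait
        return [fd for fd, mask in events if mask & select.EPOLLIN]
    finally:
        ep.close()


# A connected pair of local sockets stands in for a client connection.
a, b = socket.socketpair()
b_fd = b.fileno()

before = wait_readable(b, timeout=0)   # nothing sent yet
a.send(b"ping")
after = wait_readable(b)               # data is now waiting on b

a.close()
b.close()
print(before, after == [b_fd])
```

A real server keeps one long-lived epoll object and registers every connection with it, so the cost of each wait depends on the number of active descriptors rather than on the total number being watched.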
4. Nginx optimization
1. Compilation and installation process optimization
1). Reduce the file size of the compiled Nginx
By default, Nginx is compiled in debug mode, which inserts a lot of tracing and ASSERT information, so the compiled Nginx binary is several megabytes in size. If debug mode is turned off before compilation, the compiled binary is only a few hundred kilobytes. You can therefore modify the relevant source file to turn off debug mode before compiling. The specific method is as follows:
After the Nginx source code file is decompressed, find the auto/cc/gcc file in the source code directory, and find the following lines in it:
Comment out or delete these two lines to turn off debug mode.
2). Optimize compilation for a specific CPU type
When compiling Nginx, the default GCC compilation parameter is "-O". To optimize GCC compilation, you can use the following two parameters:
To determine the CPU type, you can use the following command:
2. Use TCMalloc to optimize the performance of Nginx
TCMalloc’s full name is Thread-Caching Malloc, which is a member of the open source tool google-perftools developed by Google. Compared with the standard glibc library's Malloc, the TCMalloc library is much more efficient and fast in memory allocation, which greatly improves the performance of the server in high concurrency situations, thereby reducing the load on the system. The following is a brief introduction on how to add TCMalloc library support to Nginx.
To install the TCMalloc library, you need to install two software packages: libunwind (not needed on 32-bit operating systems) and google-perftools. The libunwind library provides basic function call chain and function call register functionality for programs on 64-bit CPUs and operating systems. The following describes the specific process of using TCMalloc to optimize Nginx.
1). Install the libunwind library
You can download the corresponding libunwind version from http://download.savannah.gnu.org/releases/libunwind. The one downloaded here is libunwind-0.99-alpha.tar.gz. The installation process is as follows:
2). Install google-perftools
You can download the corresponding google-perftools version from http://google-perftools.googlecode.com. The one downloaded here is google-perftools-1.8.tar.gz. The installation process is as follows:
At this point, the installation of google-perftools is completed.
3). Recompile Nginx
For Nginx to support google-perftools, you need to add the "--with-google_perftools_module" option during installation and recompile Nginx. The installation code is as follows:
At this point, the Nginx installation is complete.
4). Add a thread directory for google-perftools
Create a thread directory and place the file under /tmp/tcmalloc. The operation is as follows:
5). Modify the Nginx main configuration file
Modify the nginx.conf file and add the following code under the pid line:
Then, restart Nginx to complete the loading of google-perftools.
6). Verify the running status
In order to verify that google-perftools has been loaded normally, you can view it through the following command:
Since worker_processes is set to 4 in the Nginx configuration file, four Nginx worker processes are started, and each one has a row in the output. The number at the end of each file name is the pid of a running Nginx process.
At this point, the operation of optimizing Nginx using TCMalloc is completed.
3. Nginx kernel parameter optimization
The optimization of kernel parameters is mainly the optimization of system kernel parameters for Nginx applications in Linux systems.
An optimization example is given below for reference.
Add the above kernel parameter values to the /etc/sysctl.conf file, and then execute the following command to make it effective:
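The meaning of the options in the example is as follows:
net.ipv4.tcp_max_tw_buckets sets the maximum number of TIME-WAIT sockets. The default is 180000; here it is set to 6000.
net.ipv4.ip_local_port_range sets the range of ports the system is allowed to open.
net.ipv4.tcp_tw_recycle enables fast recycling of TIME-WAIT sockets.
net.ipv4.tcp_tw_reuse enables reuse, allowing TIME-WAIT sockets to be reused for new TCP connections.
net.ipv4.tcp_syncookies enables SYN cookies; when the SYN wait queue overflows, cookies are used to handle it.
net.core.somaxconn defaults to 128. It limits the backlog of TCP connections waiting to be accepted on a listening socket. Under highly concurrent requests the default value may cause connection timeouts or retransmissions, so it should be tuned to the number of concurrent requests.
net.core.netdev_max_backlog is the maximum number of packets allowed to queue when a network interface receives packets faster than the kernel can process them.
net.ipv4.tcp_max_orphans sets the maximum number of TCP sockets in the system that are not attached to any user file handle. If this number is exceeded, orphaned connections are reset immediately and a warning is printed. This limit exists only to prevent simple DoS attacks. Do not rely on it too heavily or lower it artificially; in most cases the value should be increased.
net.ipv4.tcp_max_syn_backlog sets the maximum number of remembered connection requests that have not yet been acknowledged by the client. For systems with 128 MB of memory the default is 1024; for systems with little memory it is 128.
net.ipv4.tcp_synack_retries determines how many SYN+ACK packets the kernel sends before it gives up on a connection.
net.ipv4.tcp_syn_retries is the number of SYN packets the kernel sends before it gives up on establishing a connection.
net.ipv4.tcp_fin_timeout determines how long a socket stays in the FIN-WAIT-2 state when the local end has requested the close. The peer may misbehave and never close its side of the connection, or may even crash unexpectedly. The default is 60 seconds. Setting this value correctly is important: even a lightly loaded web server can accumulate so many dead sockets that it risks running out of memory. FIN-WAIT-2 is less dangerous than FIN-WAIT-1 because each socket consumes at most 1.5 KB of memory, but it lives longer.
net.ipv4.tcp_keepalive_time is how often TCP sends keepalive messages when keepalive is enabled. The default is 2 (hours).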
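Settings of this kind are written to /etc/sysctl.conf as key = value lines. As a rough illustration of that format, the sketch below parses such text into a dictionary so the values can be reviewed before they are applied. The parse_sysctl helper is hypothetical, and the sample contains only the three settings whose values the explanation above states or implies (6000 TIME-WAIT buckets, reuse enabled, SYN cookies enabled); it is not a complete or recommended tuning set.

```python
def parse_sysctl(text: str) -> dict:
    """Parse sysctl.conf-style text into a {key: value} dictionary."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and comment lines (# or ;) carry no settings.
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Sample input; assumed values are limited to those named in the text.
EXAMPLE = """
# limit TIME-WAIT sockets and allow their reuse
net.ipv4.tcp_max_tw_buckets = 6000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_syncookies = 1
"""

settings = parse_sysctl(EXAMPLE)
for key, value in sorted(settings.items()):
    print(f"{key} -> {value}")
```

Comparing the parsed dictionary with the current values under /proc/sys is one way to check what a change would actually alter on a given machine.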
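4. PHP-FPM optimization
If your high-load website uses PHP-FPM to manage FastCGI, these tips may be useful:
1) Increase the number of FastCGI processes
Raise the number of PHP FastCGI child processes to 100 or more; on a server with 4 GB of memory, 200 is fine. It is recommended to find the best value through load testing.
2) Raise PHP-FPM's limit on open file descriptors
The rlimit_files tag sets PHP-FPM's limit on open file descriptors; the default is 1024. The value of this tag must be tied to the Linux kernel's open-file limit. For example, to set it to 65535 you must run "ulimit -HSn 65536" on the Linux command line.
Then raise PHP-FPM's limit on open file descriptors:
# vi /path/to/php-fpm.conf
Find “
Restart PHP-FPM.
3) Increase max_requests appropriately
The max_requests tag specifies how many requests each child process handles before it is shut down; the default is 500.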
5. Nginx's PHP vulnerability
Vulnerability introduction: nginx is a widely used high-performance web server. It is often used as a reverse proxy and also supports running PHP very well. 80sec discovered a serious security issue in it: by default, the server may incorrectly parse any type of file as PHP, which allows a malicious attacker to compromise an nginx server that supports PHP.
Vulnerability analysis: by default, nginx supports PHP running in CGI mode. For example, the configuration file can use
location ~ \.php$ {
    root html;
    fastcgi_pass 127.0.0.1:9000;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
    include fastcgi_params;
}
to support parsing of PHP. When nginx selects a location for a request, it matches against the URI environment variable, while the key variable SCRIPT_FILENAME passed to the back-end FastCGI is determined by $fastcgi_script_name, which nginx generates. Analysis shows that $fastcgi_script_name is directly controlled by the URI environment variable, and this is where the problem arises. To better support the extraction of PATH_INFO, PHP has the cgi.fix_pathinfo configuration option, whose purpose is to extract the real script name from SCRIPT_FILENAME.
So suppose http://www.80sec.com/80sec.jpg exists and we access it in the following way:
http://www.80sec.com/80sec.jpg/80sec.php
This yields the URI
/80sec.jpg/80sec.php
After the location directive matches, the request is handed to the back-end FastCGI for processing, and nginx sets the environment variable SCRIPT_FILENAME for it to
/scripts/80sec.jpg/80sec.php
In other web servers such as lighttpd, we found that SCRIPT_FILENAME is correctly set to
/scripts/80sec.jpg
so the problem does not exist there.
When the back-end FastCGI receives this variable, it decides whether to perform additional processing on SCRIPT_FILENAME based on the fix_pathinfo configuration. If fix_pathinfo is not set, applications that use PATH_INFO for routing are affected, so this option is generally turned on. With the option on, PHP searches for the real script file name by checking whether each candidate file exists. SCRIPT_FILENAME and PATH_INFO are then separated into /scripts/80sec.jpg and 80sec.php respectively.
Finally, /scripts/80sec.jpg is used as the script to execute for this request, so the attacker can make nginx have PHP parse any type of file.
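The search that PHP performs can be modelled in a few lines. This is a simplified sketch of the behaviour described above, not PHP's actual code: the function name is hypothetical, the file system is replaced by a set of existing paths, and the returned PATH_INFO keeps its leading slash.

```python
def split_script_and_path_info(script_filename: str, existing_files: set):
    """Walk the path from right to left until an existing file is found.

    Returns (real_script, path_info), or (None, "") if nothing exists.
    """
    candidate = script_filename
    while candidate:
        if candidate in existing_files:
            # Everything after the real file is treated as PATH_INFO.
            return candidate, script_filename[len(candidate):]
        # Drop the last path component and try again.
        candidate = candidate.rpartition("/")[0]
    return None, ""


# Only the uploaded image exists on disk; there is no 80sec.php.
files_on_disk = {"/scripts/80sec.jpg"}

script, path_info = split_script_and_path_info(
    "/scripts/80sec.jpg/80sec.php", files_on_disk
)
print(script, path_info)
```

Because /scripts/80sec.jpg/80sec.php does not exist but /scripts/80sec.jpg does, the image is selected as the script to execute, which is exactly the behaviour the attack relies on.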
POC: Visit a site where nginx supports PHP and append /80sec.php to any resource file such as robots.txt. You can then see the following difference:
Visit http://www.80sec.com/robots.txt
HTTP/1.1 200 OK
Server: nginx/0.6.32
Date: Thu, 20 May 2010 10:05:30 GMT
Content-Type: text/plain
Content-Length: 18
Last-Modified: Thu, 20 May 2010 06:26:34 GMT
Connection: keep-alive
Keep-Alive: timeout=20
Accept-Ranges: bytes
Visit http://www.80sec.com/robots.txt/80sec.php
HTTP/1.1 200 OK
Server: nginx/0.6.32
Date: Thu, 20 May 2010 10:06:49 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Keep-Alive: timeout=20
X-Powered-By: PHP/5.2.6
The change in Content-Type shows that a different backend is now responsible for parsing, so the site may be vulnerable.
Vulnerability vendor: http://www.nginx.org
Solution:
We have tried to contact the vendor. In the meantime, you can reduce the risk with either of the following measures:
Set cgi.fix_pathinfo to 0
Or
if ( $fastcgi_script_name ~ \..*\/.*php ) {
    return 403;
}
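To see which script names a rule of this kind rejects, the pattern can be tried outside nginx. The sketch below assumes the intended pattern is \..*/.*php (a literal dot, later a slash, later "php") and uses an unanchored search in Python's re module to approximate nginx's ~ operator; the function name is hypothetical.

```python
import re

# Assumed form of the mitigation pattern: a dot, then a slash, then "php".
BLOCK_PATTERN = re.compile(r"\..*/.*php")


def is_blocked(script_name: str) -> bool:
    # nginx's "~" operator is a case-sensitive, unanchored regex match.
    return BLOCK_PATTERN.search(script_name) is not None


for name in ("/80sec.jpg/80sec.php",
             "/robots.txt/80sec.php",
             "/index.php",
             "/blog/index.php",
             "/v1.2/index.php"):
    print(name, is_blocked(name))
```

The two attack-style names are rejected and ordinary script paths pass, but note the last case: a legitimate path with a dot in a directory name is rejected as well, so the rule should be checked against a site's real URL layout before it is deployed.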
PS: Thanks to laruence for his help during the analysis.