Nginx event processing mechanism: for a basic web server, there are usually three types of events: network events, signals, and timers.
First, let's look at the basic flow of a request: establish a connection -> receive data -> send data.
Now look at it from the system's point of view: each step of that flow (establishing the connection, receiving data, sending data) is a read or write event on the underlying socket, as the sketch below illustrates.
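To make this concrete, here is a minimal sketch in C of that lifecycle using plain blocking system calls; it is an illustration of the flow, not nginx's actual code, and the port number is an arbitrary example:

```c
/* Minimal sketch of one request handled with blocking syscalls.
 * Illustrative only; error handling is omitted for brevity. */
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);               /* example port */
    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 128);

    int cfd = accept(lfd, NULL, NULL);          /* establish the connection */
    char buf[4096];
    ssize_t n = read(cfd, buf, sizeof(buf));    /* receive data: blocks until ready */
    const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
    if (n > 0)
        write(cfd, resp, strlen(resp));         /* send data */
    close(cfd);
    close(lfd);
    return 0;
}
```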
1) With blocking calls, when a read or write event is not ready, the call cannot proceed; it must wait until the event becomes ready, so the request is delayed. A blocking call sleeps in the kernel and yields the CPU to others, which is clearly unsuitable for a single-threaded worker: when there are many network events, everyone is waiting, the CPU sits idle with no one using it, CPU utilization cannot go up, and high concurrency is out of the question.
2) Since blocking calls stall when the event is not ready, use non-blocking calls instead. A non-blocking call returns EAGAIN immediately, telling you the event is not ready yet: no need to panic, come back later. So you check the event again after a while, and keep checking until it is ready. In the meantime you can do other work before testing the event again. The call no longer blocks, but you have to poll the event's status from time to time; you can get more done, yet the overhead is far from small.
Summary: non-blocking calls determine whether a read or write can proceed by repeatedly checking the event's status, and that repeated checking brings considerable overhead.
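A sketch of this polling pattern, assuming a connected socket descriptor: the descriptor is switched to O_NONBLOCK, and read() then returns EAGAIN instead of sleeping, so the caller has to keep checking:

```c
/* Sketch: non-blocking read with busy polling. Illustrative only;
 * a real server would not spin like this. Assumes cfd is a
 * connected socket. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

ssize_t read_nonblocking(int cfd, char *buf, size_t len) {
    int flags = fcntl(cfd, F_GETFL, 0);
    fcntl(cfd, F_SETFL, flags | O_NONBLOCK);   /* switch the fd to non-blocking */

    for (;;) {
        ssize_t n = read(cfd, buf, len);
        if (n >= 0)
            return n;                          /* the event was ready */
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Not ready yet: do other work, then check again later.
             * This repeated checking is exactly the overhead the
             * text describes. */
            continue;
        }
        return -1;                             /* real error */
    }
}
```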
3) Hence the asynchronous, non-blocking event processing mechanism. In terms of system calls, this means calls like select/poll/epoll/kqueue. They provide a way to monitor many events at once. The call itself blocks, but you can set a timeout: within the timeout, if any event becomes ready, the call returns. This mechanism solves both of the problems above.
Take epoll as an example: when an event is not ready, it is registered in epoll and waits there. When an event becomes ready, we handle it; if a read or write then returns EAGAIN, we put it back into epoll. So as long as some event is ready we process it, and only when nothing at all is ready do we wait inside epoll. This way we can handle large numbers of concurrent requests. The concurrent requests here are, of course, requests in flight rather than requests executing at this instant: there is only one thread, so only one request can run at any moment; we simply switch between requests continuously, and each switch happens because the request voluntarily gave up the CPU when its asynchronous event was not ready. Switching here costs nothing; you can think of it as looping over the events that are ready, which is in fact exactly what happens.
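A simplified epoll loop in C that follows this description might look like the following; this is a sketch of the mechanism only, not nginx's implementation (nginx wraps it in its own event modules):

```c
/* Sketch of the epoll pattern described above: register interest,
 * block (with a timeout) until some events are ready, handle each
 * one, and re-wait on EAGAIN. Not nginx source code. */
#include <errno.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/socket.h>

#define MAX_EVENTS 64

void event_loop(int listen_fd) {
    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event ready[MAX_EVENTS];
    for (;;) {
        /* Blocks, but only until something is ready (or 1000 ms pass). */
        int n = epoll_wait(epfd, ready, MAX_EVENTS, 1000);
        for (int i = 0; i < n; i++) {
            int fd = ready[i].data.fd;
            if (fd == listen_fd) {
                int cfd = accept(listen_fd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = cfd };
                epoll_ctl(epfd, EPOLL_CTL_ADD, cfd, &cev);  /* watch the new connection */
            } else {
                char buf[4096];
                ssize_t r = read(fd, buf, sizeof(buf));
                if (r < 0 && errno == EAGAIN)
                    continue;                   /* not ready after all: stays in epoll */
                if (r <= 0) {
                    close(fd);                  /* peer closed, or error */
                    continue;
                }
                /* ... parse, process, write a response ... */
            }
        }
    }
}
```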
4) Comparison with multi-threading:
Compared with multi-threading, this way of handling events has clear advantages: no threads need to be created, each request occupies very little memory, there is no context switching, and event handling is very lightweight. No matter how high the concurrency goes, it does not waste resources on needless context switches.
Summary: through the asynchronous, non-blocking event processing mechanism, an nginx process handles multiple ready events in a loop, achieving high concurrency while staying lightweight.
Nginx features: 1. Cross-platform: Nginx can be compiled and run on most Unix-like operating systems, and a ported version for Windows also exists.
2. Extremely simple configuration that is very easy to pick up; the configuration style reads like ordinary program code.
3. Non-blocking, highly concurrent connections: when copying data, the first stage of disk I/O is non-blocking. Official tests show support for 50,000 concurrent connections, and in real production environments 20,000 to 30,000 concurrent connections are reached (thanks to nginx's use of the epoll model).
4. Event-driven: the communication mechanism uses the epoll model, supporting larger numbers of concurrent connections.
5. No long-lived connection is required between the nginx proxy and the back-end web server;
6. Requests are received asynchronously: nginx first receives the entire user request, then forwards it to the back-end web server in one go, which greatly reduces the pressure on the back-end web server.
7. When sending a response, nginx receives data from the back-end web server and passes it on to the client.
8. Low network dependence: nginx depends very little on the network. In theory it can perform load balancing as long as the back ends can be pinged, and it can effectively separate intranet and internet traffic.
9. Server health detection: nginx can detect whether a back-end server has failed from the status code and timeout information the application server returns when processing a page, and it promptly resubmits requests that came back wrong to other nodes.
master/worker structure: a master process spawns one or more worker processes.
Low memory consumption: handling large numbers of concurrent requests consumes very little memory; under 30,000 concurrent connections, the 10 nginx processes started consume only about 150MB of memory (15MB * 10 = 150MB).
Low cost: nginx is open-source software and free to use, while purchasing a hardware load-balancing switch such as an F5 BIG-IP or a NetScaler costs from over 100,000 to several hundred thousand yuan.
Built-in health checks: if a web server behind an nginx proxy goes down, front-end access is not affected.
Bandwidth savings: supports GZIP compression and can add headers that let the browser cache content locally.
High stability: when used as a reverse proxy, the probability of downtime is very small.
nginx works in a multi-process manner. nginx also supports multi-threading, but the mainstream approach, and nginx's default mode, is multi-process. The multi-process approach brings nginx many benefits.
(1) After nginx starts, there is one master process and several worker processes. The master process mainly manages the worker processes: it receives signals from the outside world, forwards signals to the workers, monitors their running state, and automatically restarts a new worker when one exits abnormally. Basic network events are handled in the worker processes. The workers are peers: they compete on equal terms for client requests, and each process is independent of the others. A request is processed in exactly one worker process, and a worker cannot process another worker's requests.
The number of worker processes is configurable, and it is generally set to match the machine's CPU core count. The reason for this is inseparable from nginx's process model and event processing model.
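For reference, nginx can derive this automatically with the `worker_processes auto;` directive; the underlying idea, matching workers to the number of online cores, can be sketched like this:

```c
/* Sketch: querying the number of online cores, which is roughly
 * the value nginx's "worker_processes auto;" setting arrives at. */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    long ncores = sysconf(_SC_NPROCESSORS_ONLN);
    printf("cores online: %ld -> suggested worker count: %ld\n", ncores, ncores);
    return 0;
}
```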
(2) How does the master process handle a signal once it receives one (e.g. `./nginx -s reload`)?
First, after receiving the signal, the master process reloads the configuration file, then starts new worker processes and sends all the old workers a signal telling them they can retire honorably. Once the new workers start, they begin accepting new requests, while the old workers stop accepting new requests after hearing from the master; each old worker exits only after every request still in flight inside it has been processed.
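A heavily simplified sketch of this handoff is below. SIGHUP (reload) and SIGQUIT (graceful quit) match nginx's documented signals, but everything else here, including the fixed worker count, is illustrative:

```c
/* Simplified sketch of the reload handoff: on SIGHUP the master
 * re-reads config, forks fresh workers, and asks the old ones to
 * quit gracefully with SIGQUIT. Illustrative only. */
#include <signal.h>
#include <unistd.h>

#define NWORKERS 4

static volatile sig_atomic_t reload = 0;
static void on_hup(int sig) { (void)sig; reload = 1; }

static pid_t spawn_worker(void) {
    pid_t pid = fork();
    if (pid == 0) {
        for (;;) pause();   /* worker: run the event loop until told to quit */
    }
    return pid;
}

int main(void) {
    signal(SIGHUP, on_hup);
    pid_t workers[NWORKERS];
    for (int i = 0; i < NWORKERS; i++)
        workers[i] = spawn_worker();

    for (;;) {
        pause();                                  /* wait for a signal */
        if (reload) {
            reload = 0;
            /* 1. re-read the configuration file here ... */
            pid_t old[NWORKERS];
            for (int i = 0; i < NWORKERS; i++)
                old[i] = workers[i];
            for (int i = 0; i < NWORKERS; i++)
                workers[i] = spawn_worker();      /* 2. start new workers */
            for (int i = 0; i < NWORKERS; i++)
                kill(old[i], SIGQUIT);            /* 3. old workers finish
                                                     in-flight requests, then exit */
        }
    }
}
```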
(3) How does a worker process handle a request?
We mentioned earlier that worker processes are equal, and each has the same opportunity to handle a request. When nginx serves HTTP on port 80 and a connection request arrives, every worker may end up handling that connection. How does this come about? Each worker process is forked from the master process. The master first creates the socket that needs to be listened on, then forks the workers, so every worker inherits that listening socket: each process holds its own file descriptor, but they all refer to the same socket, listening on the same IP address and port, which the fork model makes possible.

Generally, when a connection arrives, all the processes blocked in accept on that socket are woken up, but only one of them can accept the connection while the others fail. This is the so-called thundering-herd phenomenon. nginx does not turn a blind eye to it: it provides accept_mutex, which, as the name suggests, is a shared lock around accept. With this lock, only one process attempts to accept a connection at any moment, so there is no thundering-herd problem. accept_mutex is a controllable option that can be turned off explicitly; it is on by default.

Once a worker accepts a connection, it reads the request, parses it, processes it, generates the response data, returns it to the client, and finally closes the connection: that is a complete request. As we can see, a request is handled entirely by one worker process, and only inside that one worker, as the sketch below shows.
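The fork-then-accept structure described here can be sketched as follows; it is a simplified model (real nginx adds non-blocking sockets, epoll, and the accept_mutex on top of this):

```c
/* Sketch of the prefork model: the master creates the listening
 * socket, then forks workers that all accept() on it. Without
 * extra serialization this is exactly the setup that produces the
 * thundering-herd wakeups described above; nginx's accept_mutex
 * lets only one worker try to accept at a time. Illustrative only. */
#include <unistd.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define NWORKERS 4

static void worker_loop(int lfd) {
    for (;;) {
        int cfd = accept(lfd, NULL, NULL);   /* all workers compete here */
        if (cfd < 0)
            continue;
        /* read the request, parse, process, write a response ... */
        close(cfd);                          /* the whole request stays
                                                in this one worker */
    }
}

int main(void) {
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(80);               /* the port from the text */
    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 511);

    for (int i = 0; i < NWORKERS; i++) {
        if (fork() == 0) {                   /* each worker inherits lfd */
            worker_loop(lfd);
            _exit(0);
        }
    }
    for (;;) pause();                        /* master: manage workers */
}
```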
(4) What are the benefits of nginx adopting this process model?
Independent processes do not affect one another: after one process exits, the others keep working, the service is not interrupted, and the master quickly starts a new worker. Of course, a worker exiting abnormally means there is a bug in the program, and the abnormal exit fails all the requests on that worker; but it does not affect all requests across the server, so the risk is contained. There are many more benefits, which readers can discover for themselves.
(5) nginx uses multiple workers to process requests, and each worker has only one main thread, so the number of requests it can be executing at any instant seems very limited: surely it can only handle as much concurrency as it has workers, so where does high concurrency come from? Not so; this is the brilliance of nginx. nginx processes requests asynchronously and without blocking, which means a single nginx instance can juggle thousands of requests at the same time.
Compare a server such as IIS, where each request exclusively occupies a worker thread: when concurrency reaches several thousand, several thousand threads are processing requests at once. That is a big challenge for the operating system: the memory occupied by the threads is very large, the CPU overhead of thread context switching is very large, performance naturally cannot improve, and all of that overhead is simply meaningless. We said before that it is recommended to set the number of workers to the number of CPU cores, and here it is easy to see why: more workers would only make processes compete for CPU resources and cause unnecessary context switching. Moreover, to better exploit multi-core machines, nginx provides a CPU affinity binding option: we can bind a given process to a given core, so the CPU cache is not invalidated by process migration.
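nginx exposes this binding as the worker_cpu_affinity directive; on Linux the mechanism underneath is the sched_setaffinity() call, sketched here for illustration:

```c
/* Sketch: pinning the current process to one CPU core, the
 * mechanism behind nginx's worker_cpu_affinity directive on
 * Linux. Illustrative only. */
#define _GNU_SOURCE
#include <sched.h>

int bind_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);        /* allow only this core */
    /* pid 0 means "the calling process" */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* e.g. worker i on an n-core machine might call bind_to_core(i % n);
 * keeping a worker on one core preserves its CPU cache across
 * scheduling. */
```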
Load balancing
Load balancing technology provides a cheap, effective, and transparent way, on top of the existing network structure, to expand the bandwidth of network devices and servers, increase throughput, strengthen network data processing capacity, and improve network flexibility and availability. It has two meanings: first, a large volume of concurrent access or data traffic is shared across multiple node devices that handle it separately, reducing the time users wait for a response; second, a single heavy operation is shared across multiple node devices for parallel processing, and after each node finishes, the results are gathered and returned to the user, greatly improving the system's processing capacity.
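The simplest distribution policy matching the first meaning is round robin, which is also nginx's default upstream behavior; a minimal sketch of the idea follows (the backend addresses are hypothetical, and this is the policy only, not nginx's implementation):

```c
/* Minimal round-robin selection across backend nodes: each new
 * request goes to the next server in turn, spreading concurrent
 * load evenly across the nodes. */
#include <stdio.h>

static const char *backends[] = { "10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080" };
#define NBACKENDS (sizeof(backends) / sizeof(backends[0]))

const char *pick_backend(void) {
    static unsigned next = 0;
    return backends[next++ % NBACKENDS];   /* rotate through the nodes */
}

int main(void) {
    for (int i = 0; i < 6; i++)            /* six requests, two full rounds */
        printf("request %d -> %s\n", i, pick_backend());
    return 0;
}
```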