
Understand Gunicorn and Python GIL in one article

WBOY · 2023-04-12


What the Python GIL is, how it works, and how it affects Gunicorn.

Which Gunicorn worker type should I choose for a production environment?

Python has a global interpreter lock (the GIL) that allows only one thread at a time to run (i.e. interpret bytecode). In my opinion, understanding how Python handles concurrency is essential if you want to optimize your Python services.

Python and gunicorn give you different ways to handle concurrency, and since there is no magic bullet that covers all use cases, it's a good idea to understand the options, tradeoffs, and advantages of each option.

Gunicorn worker types

Gunicorn exposes these different options under the concept of "worker types". Each type is suitable for a specific set of use cases.

  • sync - fork N processes that serve requests in parallel.
  • gthread - spawn N threads per process to serve requests concurrently.
  • eventlet/gevent - spawn green threads to serve requests concurrently.
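As a rough illustration, these worker types map onto Gunicorn's configuration file settings like this (a minimal sketch: the module path "app:app" and the counts are placeholders, not recommendations):

```python
# gunicorn.conf.py - a minimal sketch of choosing a worker type
wsgi_app = "app:app"  # hypothetical WSGI entry point

# Number of forked worker processes (a common rule of thumb is 2 * cores + 1)
workers = 4

# Pick ONE worker type depending on your workload:
worker_class = "sync"       # simplest: one request at a time per process
# worker_class = "gthread"  # threads per process; also set `threads = 8`
# worker_class = "gevent"   # greenlets; requires the gevent package
```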

Gunicorn sync worker

This is the simplest worker type: the only concurrency option is to fork N processes that serve requests in parallel.

They can work well, but they incur a lot of overhead (such as memory and CPU context switching), and if most of your request time is spent waiting for I/O, they scale poorly.

Gunicorn gthread worker

The gthread worker improves on this by allowing you to create N threads per process. This improves I/O performance because you can run more instances of your code concurrently. It is the only one of the four worker types affected by the GIL.
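Why threads help with I/O despite the GIL can be sketched with the standard library alone: waits simulated with `time.sleep` release the GIL, so they overlap across threads even though only one thread interprets bytecode at a time. This is a toy illustration, not Gunicorn's actual worker code:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def io_task(_):
    # time.sleep releases the GIL, so waits can overlap across threads
    time.sleep(0.2)
    return True

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(io_task, range(8)))
elapsed = time.perf_counter() - start
# Eight 0.2 s waits overlap: wall time stays close to 0.2 s, not 1.6 s
```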

Gunicorn eventlet and gevent workers

The eventlet/gevent workers attempt to further improve on the gthread model by running lightweight user threads (also known as green threads, greenlets, etc.).

This allows you to have thousands of these greenlets at very little cost compared to system threads. Another difference is that greenlets follow a cooperative model rather than a preemptive one: each runs uninterrupted until it blocks. We will first analyze how the gthread worker behaves when handling requests and how it is affected by the GIL.

Unlike sync, where each request is served directly by one process, with gthread each process has N threads, which scales better without the expense of spawning more processes. Since you are running multiple threads in the same process, the GIL prevents them from running in parallel.

The GIL is not a process or a special thread. It is just a boolean variable whose access is protected by a mutex, and it ensures that only one thread runs within each process at a time. In the scenario below, two system threads run concurrently, each handling one request. The sequence looks like this:

  • Thread A holds the GIL and starts serving a request.
  • After a while, thread B tries to serve a request but cannot acquire the GIL.
  • B sets a timeout to force the GIL to be released if that has not happened by the time the timeout expires.
  • A does not release the GIL before the timeout is reached.
  • B sets the gil_drop_request flag to force A to release the GIL immediately.
  • A releases the GIL and waits until another thread grabs it, to avoid a situation where A keeps releasing and re-acquiring the GIL without any other thread being able to grab it.
  • B starts running.
  • B releases the GIL while blocking on I/O.
  • A starts running.
  • B tries to run again but is suspended.
  • A completes before the timeout is reached.
  • B completes.
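The practical consequence of the steps above can be checked with a toy experiment: two pure-Python CPU-bound threads take roughly as long as running the same work sequentially, because the GIL only interleaves them. (Timings are machine-dependent; this is an illustration, not one of the article's benchmarks.)

```python
import threading
import time

def spin(n):
    # Pure-Python CPU work; the thread holds the GIL while it runs
    while n:
        n -= 1

N = 2_000_000

start = time.perf_counter()
spin(N)
spin(N)
sequential = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=spin, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start
# Under the GIL the two threads interleave rather than run in parallel,
# so the threaded wall time stays close to the sequential wall time.
```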

Same scenario but using gevent


Another option to increase concurrency without using processes is to use greenlets. This worker spawns "user threads" instead of "system threads" to increase concurrency.

While this means they are not affected by the GIL, it also means you still cannot increase parallelism, because greenlets cannot be scheduled onto multiple CPU cores in parallel.

  • Greenlet A runs until an I/O event occurs or execution completes.
  • Greenlet B waits until greenlet A releases the event loop.
  • A finishes.
  • B starts.
  • B releases the event loop to wait for I/O.
  • B completes.
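gevent itself is a third-party package, but the cooperative pattern above can be illustrated with the standard library's asyncio, whose coroutines likewise yield to an event loop instead of being preempted (an analogue for illustration, not gevent's API):

```python
import asyncio
import time

async def handle(i):
    # Yields to the event loop, like a greenlet blocking on I/O
    await asyncio.sleep(0.1)
    return i

async def main():
    # 100 concurrent "requests" whose waits all overlap
    return await asyncio.gather(*(handle(i) for i in range(100)))

start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
# All 100 waits overlap, so total wall time is close to 0.1 s, not 10 s
```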

In this case, it is clear that a greenlet worker is not ideal: the second request has to wait until the first one completes, only to then sit idle waiting for I/O itself.

In I/O-bound scenarios, by contrast, the cooperative greenlet model really shines, because no time is wasted on context switches and you avoid the overhead of running multiple system threads.

We will see this in the benchmarks at the end of this article. For now, this raises the following questions:

  • Will changing the thread context-switch timeout affect service latency and throughput?
  • How do you choose between gevent/eventlet and gthread when you mix I/O and CPU work?
  • How do you choose the number of threads for the gthread worker?
  • Should you just use sync workers and increase the number of forked processes to avoid the GIL?

To answer these questions, you need monitoring to collect the necessary metrics, and then you need to run tailored benchmarks against those same metrics. There is no point running synthetic benchmarks that have zero correlation with your actual usage patterns. The graphs below show latency and throughput metrics for different scenarios to give you an idea of how it all works together.

Benchmarking the GIL switching interval

Here we can see how changing the GIL thread switch interval/timeout affects request latency. As expected, I/O latency improves as the switch interval decreases, because CPU-bound threads are forced to release the GIL more frequently and let other threads complete their work.

But this is not a panacea. Reducing the switch interval makes CPU-bound threads take longer to complete. We can also see an increase in overall latency and a drop in throughput due to the added overhead of constant thread switching. If you want to try it yourself, you can change the switch interval with `sys.setswitchinterval`:

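A minimal sketch of adjusting the interval with CPython's standard `sys` module (the 1 ms value here is just an example, not a recommendation):

```python
import sys

# CPython's default thread switch interval is 5 ms
default = sys.getswitchinterval()

# Ask CPU-bound threads to hand off the GIL every 1 ms instead
sys.setswitchinterval(0.001)
current = sys.getswitchinterval()

# Restore the default so the rest of the process is unaffected
sys.setswitchinterval(default)
```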

Benchmarking gthread vs. gevent latency using CPU bound requests


Overall, we can see that the benchmarks reflect our intuition from our previous analysis of how GIL-bound threads and greenlets work.

gthread has better average latency for I/O-bound requests because the switch interval forces long-running threads to release the GIL.

gevent has better latency for CPU-bound requests because greenlets are not interrupted to serve other requests.

Benchmarking gthread vs. gevent throughput using CPU bound requests


The results here also reflect our earlier intuition that gevent achieves better throughput than gthread. Still, these benchmarks are highly dependent on the type of work being done and may not translate directly to your use case.

The main goal of these benchmarks is to give you some guidance on what to test and measure in order to maximize each CPU core that will serve requests.

Since all Gunicorn worker types let you specify the number of processes, what changes is how each process handles concurrent connections. So make sure to use the same number of workers to keep the comparison fair. Let's now try to answer the earlier questions using the data collected from our benchmarks.

Will changing the thread context switch timeout affect service latency and throughput?

It does. However, for the vast majority of workloads, it is not a game changer.

How do you choose between gevent/eventlet and gthread when mixing I/O and CPU work? As we saw, gthread tends to allow better concurrency when you have more CPU-intensive work.

How do you choose the number of threads for the gthread worker?

As long as your benchmarks simulate production-like behavior, you will clearly see performance peak and then start to degrade once there are too many threads.

Should you just use sync workers and increase the number of forked processes to avoid the GIL?

Unless your I/O is almost zero, scaling with just processes is not the best option.

Conclusion

Coroutines/Greenlets can improve CPU efficiency because they avoid interrupts and context switches between threads. Coroutines trade latency for throughput.

Coroutines can cause more unpredictable latency if you mix I/O-bound and CPU-bound endpoints, because CPU-bound endpoints are not interrupted to serve other incoming requests. If you take the time to configure Gunicorn correctly, the GIL is not a problem.


Statement:
This article is reproduced from 51cto.com. In case of infringement, please contact admin@php.cn for removal.