Why the GIL is needed
The GIL is essentially a lock. Anyone who has studied operating systems knows that locks are introduced to avoid the data inconsistencies caused by concurrent access. CPython defines many global variables outside of functions, such as usable_arenas and usedpools in the memory allocator; if multiple threads request memory at the same time, these variables may be modified concurrently and end up corrupted. In addition, Python's garbage collection is based on reference counting: every object has an ob_refcnt field recording how many variables currently reference it. Operations such as variable assignment and argument passing increase the reference count, while leaving a scope or returning from a function decreases it. If multiple threads modify the reference count of the same object at the same time, ob_refcnt may end up differing from the true value. This can cause memory leaks, where objects that are no longer used are never reclaimed, or, in more serious cases, an object that is still referenced gets freed, crashing the Python interpreter.
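For a rough feel of how reference counting behaves, sys.getrefcount() exposes an object's current ob_refcnt from Python code; a minimal sketch (the printed values are typical, and are one higher than you might expect because the argument passed to getrefcount() itself holds a temporary reference):

import sys

obj = []                      # a fresh list, referenced only by the name "obj"
print(sys.getrefcount(obj))   # typically 2: "obj" plus the temporary argument reference

alias = obj                   # binding another name increments ob_refcnt
print(sys.getrefcount(obj))   # typically 3

del alias                     # unbinding decrements ob_refcnt again
print(sys.getrefcount(obj))   # back to 2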
Implementation of the GIL
The GIL is defined in CPython as follows:
struct _gil_runtime_state {
    // if a requesting thread still has not obtained the GIL after the switch
    // interval, it signals the holding thread to release it
    unsigned long interval;
    // the thread that last held the GIL, used when forcing a thread switch
    _Py_atomic_address last_holder;
    // whether the GIL is currently held by some thread
    _Py_atomic_int locked;
    // how many times the GIL has switched holders
    unsigned long switch_number;
    // condition variable and mutex, which usually appear in pairs
    PyCOND_T cond;
    PyMUTEX_T mutex;
    // condition variable used to force a thread switch
    PyCOND_T switch_cond;
    PyMUTEX_T switch_mutex;
};
The most essential field is locked, protected by mutex, which indicates whether the GIL is currently held; the other fields exist to optimize GIL behavior. A thread calls take_gil() to acquire the GIL and drop_gil() to release it. To avoid starvation, if a thread has waited for the switch interval (5 milliseconds by default) without obtaining the GIL, it actively signals the holding thread, and the holder checks for that signal at appropriate points, forcibly releasing the GIL if another thread is waiting. What counts as an appropriate point differs between versions: early versions checked every 100 bytecode instructions, while in Python 3.10.4 the check happens at the end of a conditional statement, at the end of each iteration of a loop body, and at the end of a function call.
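The switch interval is exposed to Python code, as a value in seconds, through sys.getswitchinterval() and sys.setswitchinterval(); a minimal sketch:

import sys

print(sys.getswitchinterval())   # 0.005 by default, i.e. 5 milliseconds

# Lengthening the interval reduces switching overhead but makes waiting
# threads wait longer for the GIL; shortening it does the opposite.
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())   # 0.01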
A simplified version of take_gil(), the function that acquires the GIL, is shown below:
static void take_gil(PyThreadState *tstate)
{
    ...
    // acquire the mutex
    MUTEX_LOCK(gil->mutex);

    // if the GIL is free, take it directly
    if (!_Py_atomic_load_relaxed(&gil->locked)) {
        goto _ready;
    }

    // otherwise wait for it
    while (_Py_atomic_load_relaxed(&gil->locked)) {
        unsigned long saved_switchnum = gil->switch_number;
        unsigned long interval = (gil->interval >= 1 ? gil->interval : 1);
        int timed_out = 0;
        COND_TIMED_WAIT(gil->cond, gil->mutex, interval, timed_out);
        // the wait timed out and no switch happened in the meantime:
        // ask the holder to drop the GIL
        if (timed_out &&
            _Py_atomic_load_relaxed(&gil->locked) &&
            gil->switch_number == saved_switchnum)
        {
            SET_GIL_DROP_REQUEST(interp);
        }
    }

_ready:
    MUTEX_LOCK(gil->switch_mutex);
    _Py_atomic_store_relaxed(&gil->locked, 1);
    _Py_ANNOTATE_RWLOCK_ACQUIRED(&gil->locked, /*is_write=*/1);

    if (tstate != (PyThreadState*)_Py_atomic_load_relaxed(&gil->last_holder)) {
        _Py_atomic_store_relaxed(&gil->last_holder, (uintptr_t)tstate);
        ++gil->switch_number;
    }

    // signal the condition variable that the forcibly switched thread waits on
    COND_SIGNAL(gil->switch_cond);
    MUTEX_UNLOCK(gil->switch_mutex);

    if (_Py_atomic_load_relaxed(&ceval2->gil_drop_request)) {
        RESET_GIL_DROP_REQUEST(interp);
    }
    else {
        COMPUTE_EVAL_BREAKER(interp, ceval, ceval2);
    }
    ...
    // release the mutex
    MUTEX_UNLOCK(gil->mutex);
}
To keep the whole operation consistent, the function acquires the mutex gil->mutex at the start and releases it at the end. If the GIL is currently free, it is taken directly. Otherwise the thread waits on the condition variable gil->cond for one switch interval (clamped to at least 1); if the wait times out and no GIL switch has happened in the meantime, it sets gil_drop_request to ask the holding thread to release the GIL, otherwise it keeps waiting. Once the GIL is obtained, gil->locked, gil->last_holder and gil->switch_number are updated, the condition variable gil->switch_cond is signalled, and the mutex gil->mutex is released.
A simplified version of drop_gil(), the function that releases the GIL, is shown below:
static void drop_gil(struct _ceval_runtime_state *ceval,
                     struct _ceval_state *ceval2,
                     PyThreadState *tstate)
{
    ...
    if (tstate != NULL) {
        _Py_atomic_store_relaxed(&gil->last_holder, (uintptr_t)tstate);
    }

    MUTEX_LOCK(gil->mutex);
    _Py_ANNOTATE_RWLOCK_RELEASED(&gil->locked, /*is_write=*/1);
    // release the GIL
    _Py_atomic_store_relaxed(&gil->locked, 0);
    // wake up a thread that is waiting for the GIL
    COND_SIGNAL(gil->cond);
    MUTEX_UNLOCK(gil->mutex);

    if (_Py_atomic_load_relaxed(&ceval2->gil_drop_request) && tstate != NULL) {
        MUTEX_LOCK(gil->switch_mutex);
        // force this thread to wait until a thread switch has actually
        // happened before it is woken up again, to avoid starvation
        if (((PyThreadState*)_Py_atomic_load_relaxed(&gil->last_holder)) == tstate)
        {
            assert(is_tstate_valid(tstate));
            RESET_GIL_DROP_REQUEST(tstate->interp);
            COND_WAIT(gil->switch_cond, gil->switch_mutex);
        }
        MUTEX_UNLOCK(gil->switch_mutex);
    }
}
The GIL is first released under the protection of gil->mutex, and a thread waiting for the GIL is then woken up. On a multi-core machine, the releasing thread would otherwise have a high probability of immediately reacquiring the GIL. To avoid starving other threads, if a drop was requested the current thread is forced to wait on the condition variable gil->switch_cond, and it is only woken up once another thread has actually taken the GIL.
Some notes
GIL optimization
Code constrained by the GIL cannot run in parallel, which hurts overall performance. To minimize this loss, Python actively releases the GIL during IO operations and during CPU-intensive computations that do not touch Python objects, reducing the effective granularity of the GIL. Examples include:
Reading and writing files
Network access
Encrypting or compressing data
So strictly speaking, even within a single process, multiple Python threads can execute simultaneously: for example, one thread running ordinary bytecode while another thread compresses data.
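A minimal sketch of this effect, assuming a multi-core machine and relying on the fact that CPython's zlib module releases the GIL while compressing large buffers (the exact timings will vary by machine and Python version):

import os
import threading
import time
import zlib

# 64 MB of incompressible data makes the compression call clearly CPU-bound
data = os.urandom(64 * 1024 * 1024)

def compress():
    zlib.compress(data)

# sequential: two compressions one after another
start = time.perf_counter()
compress()
compress()
print("sequential:", time.perf_counter() - start)

# threaded: the two compressions can overlap only because zlib
# releases the GIL while it works
threads = [threading.Thread(target=compress) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
print("threaded:  ", time.perf_counter() - start)

On a multi-core machine the threaded run can finish in noticeably less than twice the time of a single compression; a pure-Python CPU-bound loop would show no such speedup.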
User data consistency cannot rely on the GIL
The GIL is a lock that exists to keep the Python interpreter's internal state consistent; it is not responsible for the consistency of user data. It does provide some incidental guarantees: in Python 3.10.4, for example, a bytecode instruction that does not involve jumps or function calls executes atomically under the GIL. But the consistency of data in business logic must be guaranteed by the user's own locking.
The following code uses two threads to simulate a user collecting fragments and winning rewards:
from threading import Thread


def main():
    stat = {"piece_count": 0, "reward_count": 0}
    t1 = Thread(target=process_piece, args=(stat,))
    t2 = Thread(target=process_piece, args=(stat,))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    print(stat)


def process_piece(stat):
    for i in range(10000000):
        if stat["piece_count"] % 10 == 0:
            reward = True
        else:
            reward = False
        if reward:
            stat["reward_count"] += 1
        stat["piece_count"] += 1


if __name__ == "__main__":
    main()
Assume the user gets one reward for every 10 fragments collected. Each thread collects 10,000,000 fragments and should therefore earn 999,999 rewards (the last one is not counted), so in total 20,000,000 fragments should be collected and 1,999,998 rewards earned. The result of the first run on my machine, however, was:
{'piece_count': 20000000, 'reward_count': 1999987}
The total fragment count matches expectations, but the reward count is 11 short of the expected 1,999,998. The fragment count is correct because in Python 3.10.4 the statement stat["piece_count"] += 1 executes atomically under the GIL. However, the executing thread may be switched at the end of each loop iteration, so thread t1 may increase piece_count to 100 at the end of one iteration, and before its next iteration performs the modulo-10 check, the interpreter switches to thread t2, which increases piece_count to 101; the reward for 100 is then missed.
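Following the advice above that user data needs its own locking, a minimal sketch of a fix using threading.Lock (the names mirror the example above):

from threading import Lock, Thread

lock = Lock()


def process_piece(stat):
    for _ in range(10000000):
        # the whole check-and-update sequence is now one critical section,
        # so no other thread can change piece_count between the modulo
        # check and the increments
        with lock:
            if stat["piece_count"] % 10 == 0:
                stat["reward_count"] += 1
            stat["piece_count"] += 1


def main():
    stat = {"piece_count": 0, "reward_count": 0}
    threads = [Thread(target=process_piece, args=(stat,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(stat)  # reward_count is now deterministic


if __name__ == "__main__":
    main()

Note that the lock serializes the two threads inside the critical section, so this version runs more slowly; it trades away parallelism that the GIL was not really providing anyway in exchange for correctness.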
Appendix: how to avoid being affected by the GIL
After all this explanation, the article would be a mere popular-science piece of little practical value if it offered no solutions. If the GIL is this limiting, is there a way around it? Let's look at the available options.
Use multiprocessing instead of threading
The multiprocessing library exists largely to make up for the inefficiency that the GIL imposes on the threading library. It replicates much of the interface provided by threading to ease migration; the key difference is that it uses multiple processes instead of multiple threads. Each process has its own independent GIL, so there is no GIL contention between processes.
Of course, multiprocessing is not a panacea. Introducing it makes data communication and synchronization between processes harder. Take a counter as an example: if we want multiple threads to accumulate into the same variable, with threading we simply declare a global variable and wrap a few lines in a threading.Lock context. With multiprocessing, the processes cannot see each other's data, so we must declare a Queue in the main process and put/get values through it, or use shared memory. This extra implementation cost makes writing multi-threaded programs, which is already painful, even more painful. Where exactly do the difficulties lie? Interested readers can further read this article.
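A minimal sketch of the counter with multiprocessing, using a shared Value and its built-in lock rather than a Queue (the worker name and iteration count are illustrative):

import multiprocessing as mp


def worker(counter, n):
    for _ in range(n):
        # Value carries its own lock; without it, += on the shared int
        # would race between processes
        with counter.get_lock():
            counter.value += 1


if __name__ == "__main__":
    counter = mp.Value("i", 0)   # an int living in shared memory
    procs = [mp.Process(target=worker, args=(counter, 100000)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)         # 400000

Compared with the threading version, the state has to be created explicitly in shared memory and passed to the workers; ordinary Python objects such as the dict used earlier cannot be shared directly.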
Use another interpreter
As mentioned earlier, the GIL is only an artifact of CPython, so are other interpreters any better? Yes: interpreters such as Jython and IronPython do not need a GIL, thanks to the nature of their implementation languages. However, because they are implemented in Java/C#, they also lose the ability to use the community's many useful C extension modules, so these interpreters have always been relatively niche. After all, most people choose features and performance first; done is better than perfect.
So it’s hopeless?
Of course not. The Python community keeps working hard to improve the GIL, and has even tried to remove it, and every minor version has brought improvements. Interested readers can further read this slide.
Another improvement is Reworking the GIL:
– Changed the switching granularity from counting opcodes to counting time slices
– Prevented the thread that just released the GIL from being scheduled again immediately
– Added thread priorities (a high-priority thread can force other threads to release the GIL they hold)