Python's default garbage collection mechanism is reference counting: every object maintains an ob_refcnt field. Its advantage is simplicity. When a new reference points to an object, the object's reference count is increased by 1; when a reference to the object is destroyed, the count is decreased by 1. Once the count reaches 0, the object is reclaimed immediately and the memory it occupied is released. One disadvantage is the extra space needed to store the counts, but the main problem is that reference counting cannot handle circular references.
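The counting behavior described above can be observed from Python itself. The following is a minimal sketch, assuming CPython; Payload is a hypothetical class introduced only so that a weak reference can be taken. Because the absolute value returned by sys.getrefcount is an implementation detail, the sketch looks only at how the value changes.

```python
import sys
import weakref


class Payload:
    """Hypothetical class; a plain object() cannot be weakly referenced."""


obj = Payload()
base = sys.getrefcount(obj)       # absolute value is an implementation detail

alias = obj                       # a new reference: the count goes up by 1
count_with_alias = sys.getrefcount(obj)

del alias                         # the reference is destroyed: the count goes down by 1
count_after_del = sys.getrefcount(obj)

probe = weakref.ref(obj)          # a weak reference does not add to the count
del obj                           # the last strong reference is gone
freed_immediately = probe() is None   # CPython reclaims the object right away

print(count_with_alias - base, count_after_del - base, freed_immediately)
```

The weak reference is used only as a probe: it lets the script check whether the object still exists without keeping it alive.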
What is a circular reference? A and B refer to each other and there is no external reference to either A or B. Although their reference counts are both 1, they should obviously be recycled. Example:
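    a = {}        # reference count of a's dict is 1
    b = {}        # reference count of b's dict is 1
    a['b'] = b    # b's count increases by 1, to 2
    b['a'] = a    # a's count increases by 1, to 2
    del a         # a's count decreases by 1, to 1
    del b         # b's count decreases by 1, to 1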
In this example, the del statements remove the names a and b and decrement the reference counts of the objects they referred to. However, because each object still holds a reference to the other, neither count drops to zero, even though the objects can no longer be reached by name. Under reference counting alone, these objects would never be destroyed; they would stay in memory indefinitely, which is a memory leak. To solve the circular reference problem, Python introduced two additional GC mechanisms: mark-sweep and generational collection.
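The same kind of cycle can be built and then reclaimed explicitly with the standard gc module. This is a minimal sketch, assuming CPython; Node is a hypothetical class used in place of dict because dict instances cannot be weakly referenced, and the automatic collector is switched off temporarily so that the unreclaimed state is observable.

```python
import gc
import weakref


class Node:
    """Hypothetical container; unlike dict, instances can be weakly referenced."""

    def __init__(self):
        self.other = None


gc.disable()                  # keep the cyclic collector from running on its own
try:
    a = Node()
    b = Node()
    a.other = b               # a -> b
    b.other = a               # b -> a
    probe = weakref.ref(a)    # probe that does not keep a alive

    del a
    del b
    alive_before = probe() is not None   # counts are stuck at 1, nothing is freed

    found = gc.collect()                 # run the cyclic collector explicitly
    alive_after = probe() is not None    # the cycle has now been reclaimed
finally:
    gc.enable()

print(alive_before, alive_after, found >= 2)
```

gc.collect() returns the number of unreachable objects it found. The exact number depends on the interpreter version, so the sketch only checks that at least the two Node objects were counted.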
Mark-sweep is a garbage collection algorithm based on tracing. Objects are connected through references (pointers), and together they form a directed graph: the objects are the nodes and the references are the edges. Starting from the root objects, the collector traverses the graph along the directed edges. Reachable objects are marked as live, and unreachable objects are the ones to be swept. The root objects are global references and references on the function call stack; the objects they refer to cannot be deleted.
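The traversal can be illustrated on a toy object graph. The sketch below is a textbook-style model and not CPython's implementation: the mark_sweep function, the heap dictionary, and the object names are all hypothetical. CPython's cycle collector has no explicit root set; it works over the tracked container objects and discounts the references they hold to each other.

```python
def mark_sweep(heap, roots):
    """Toy model: heap maps an object name to the names it references.

    Returns (live, garbage) as two sets of names.
    """
    marked = set()
    stack = list(roots)
    while stack:                      # mark phase: walk edges from the roots
        name = stack.pop()
        if name in marked:
            continue
        marked.add(name)
        stack.extend(heap[name])
    garbage = set(heap) - marked      # sweep phase: everything left unmarked
    return marked, garbage


heap = {
    "config": ["cache"],   # reachable from a root
    "cache": [],
    "a": ["b"],            # a and b only reference each other
    "b": ["a"],
}
live, garbage = mark_sweep(heap, roots=["config"])
print(sorted(live), sorted(garbage))   # ['cache', 'config'] ['a', 'b']
```

The cycle between a and b is classified as garbage because no path from a root reaches it, regardless of how many references the two objects hold to each other.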
As Python's auxiliary garbage collection technique, the mark-sweep algorithm mainly deals with container objects such as list, dict, tuple, and class instances, because strings and numeric objects hold no references to other objects and therefore cannot take part in a circular reference. Python organizes these container objects in a doubly linked list.
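Whether a particular object is under the collector's supervision can be checked with gc.is_tracked. A minimal sketch, assuming CPython; only types whose tracking status is stable across versions are shown, since some containers (small tuples or dicts, for example) may be handled differently depending on the version.

```python
import gc

tracked = {
    "int": gc.is_tracked(42),        # atomic object, cannot form a cycle
    "float": gc.is_tracked(3.14),    # atomic object
    "str": gc.is_tracked("text"),    # atomic object
    "list": gc.is_tracked([1, 2]),   # container, may reference other objects
}
print(tracked)
```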
Generational collection trades space for time. Python groups objects into sets according to how long they have survived, and each set is called a generation. There are 3 generations: the young generation (generation 0), the middle generation (generation 1), and the old generation (generation 2). They correspond to 3 linked lists, and the collection frequency decreases as object survival time increases. Newly created objects are placed in the young generation. When the number of objects in the young generation's linked list reaches its threshold, garbage collection is triggered: objects that can be reclaimed are reclaimed, and the survivors are moved to the middle generation, and so on. Objects in the old generation are the ones that have survived the longest, possibly for the entire lifetime of the program. Generational collection is itself built on the mark-sweep technique.
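The generation thresholds and counters can be inspected through the gc module. This is a minimal sketch, assuming CPython; the concrete numbers are deliberately not shown because the defaults and the internal handling of generations differ between interpreter versions.

```python
import gc

thresholds = gc.get_threshold()   # one threshold per generation
counts = gc.get_count()           # current counter for each generation

collected_young = gc.collect(0)   # collect only the young generation
collected_all = gc.collect()      # full collection across all generations

print(thresholds, counts, collected_young, collected_all)
```

If a program needs different tuning, gc.set_threshold accepts new threshold values; raising the first one makes young-generation collections less frequent at the cost of holding more garbage between runs.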
Like mark-sweep, generational collection is an auxiliary garbage collection technique in Python and applies to those same container objects.