
Learn about the garbage collector in CPython in one article

WBOY
2022-10-12 15:32:35

This article introduces CPython's garbage collector, Python's built-in mechanism for dealing with circular references. We hope you find it helpful.



Garbage collector in CPython

The garbage collector in CPython (GC for short) is Python's built-in mechanism for solving the circular reference problem. By default it is enabled and runs automatically from time to time, so you don't have to worry about circular references clogging up your memory.

The garbage collector is designed to find and remove circularly referenced objects from CPython's working memory. It does this in the following way.

  • Detect circularly referenced objects

  • Call the finalizers (the __del__ methods) of those objects

  • Remove the pointers from each object (which breaks the cycle), but only if the cycle is still isolated after the second step

After this process is complete, every object in the cycle has a reference count of 0, so it is deleted from memory.
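The condition in the last step can be observed with a small experiment using the standard gc module. The sketch below assumes CPython 3.4 or later; the Phoenix class and the survivors list are hypothetical names made up for this illustration. The finalizer makes the object reachable again, so the collector leaves the cycle alone instead of tearing it down.

```python
import gc

survivors = []  # hypothetical module-level list used to "resurrect" objects


class Phoenix:
    """Hypothetical class whose finalizer makes the object reachable again."""

    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        # Storing self in a global list means the object is no longer orphaned.
        survivors.append(self)


gc.disable()  # avoid an automatic collection in the middle of the experiment
try:
    a, b = Phoenix("a"), Phoenix("b")
    a.partner, b.partner = b, a  # build a two-object cycle
    del a, b                     # the cycle is now isolated
    gc.collect()                 # finalizers run, then the cycle is re-checked
finally:
    gc.enable()

resurrected = sorted(p.name for p in survivors)
links_intact = all(p.partner is not None for p in survivors)
print(resurrected, links_intact)
```

Because both objects became reachable again during the finalizer step, their partner pointers are not removed and neither object is deleted.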

Although it works automatically, we can actually import it as a module from the standard library. For example:

import gc
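Once imported, the module lets us check and control whether automatic collection is switched on. The following is a minimal sketch using only gc.isenabled(), gc.disable() and gc.enable(); the variable names are just for illustration.

```python
import gc

was_enabled = gc.isenabled()  # normally True in a fresh interpreter

gc.disable()                  # pause automatic cycle collection
paused = not gc.isenabled()

gc.enable()                   # switch it back on
running = gc.isenabled()

if not was_enabled:
    gc.disable()              # restore whatever state we started with

print(was_enabled, paused, running)
```

Note that disabling the collector only stops the cycle detector; reference counting keeps working as usual.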

Detecting circular references

CPython's garbage collector tracks many of the objects that exist in memory, but not all of them. We can instantiate some objects and check whether the garbage collector tracks them.

>>> gc.is_tracked("a string")
False
>>> gc.is_tracked(["a", "list"])
True

If an object can contain pointers to other objects, it can form part of a circular reference structure, and detecting and tearing down such structures is exactly what the garbage collector exists for. In Python, such objects are often called "container objects".

So, the garbage collector needs to know about any objects that may exist as part of a circular reference. Strings cannot, so "a string" will not be tracked by the garbage collector. Lists (as we've seen) can contain pointers, so ['a', 'list'] is tracked.
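To see the difference between atomic values and containers at a glance, here is a small sketch that runs gc.is_tracked over a handful of sample values. The samples dictionary is made up for this example. Tuples and dicts are left out on purpose, because CPython may stop tracking them when they hold only atomic values.

```python
import gc

# Hypothetical sample values: four atomic objects and one container.
samples = {
    "int": 42,
    "float": 3.14,
    "str": "a string",
    "bytes": b"raw bytes",
    "list": ["a", "list"],
}

tracked = {label: gc.is_tracked(value) for label, value in samples.items()}

for label, is_tracked in tracked.items():
    print(f"{label:>5}: {is_tracked}")
```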

Instances of user-defined classes are generally tracked by the garbage collector as well, because we can always set arbitrary attributes (pointers) on them.
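The snippets that follow use a class called MyNamedClass whose definition is not shown in this article. A minimal sketch that is consistent with the output shown further down might look like this; the name attribute and the wording of the message are assumptions inferred from that output.

```python
class MyNamedClass:
    """Assumed definition: stores a name and announces its own deletion."""

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Runs when the object is about to be destroyed.
        print(f"Deleting {self.name}!")
```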

>>> Wade = MyNamedClass("Wade")
>>> gc.is_tracked(Wade)
True

So, the garbage collector knows all objects that may form circular references. How does it know if a circular reference has been formed?

It also knows all the pointers held by each object and where they point. We can see this in action.

>>> my_list = ["a", "list"]
>>> gc.get_referents(my_list)
['list', 'a']

The get_referents function receives an object and returns a list of the objects it points to directly (its referents); under the hood it relies on the object's traversal method. So the list above contains pointers to each of its elements, which are strings.
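It is worth noting that get_referents only looks one level deep. The sketch below, with made-up variable names, nests one list inside another to show that the outer list's referents include the inner list itself but not the inner list's elements.

```python
import gc

inner = ["inner", "list"]
outer = [inner, "text"]

referents = gc.get_referents(outer)

# The outer list points directly at the inner list...
holds_inner = any(r is inner for r in referents)
# ...but not at the strings stored inside the inner list.
holds_inner_items = any(r is inner[0] for r in referents)

print(len(referents), holds_inner, holds_inner_items)
```

To follow pointers further, we would call get_referents again on each result.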

Let's try get_referents on a cycle of objects (not yet an isolated cycle, since the objects can still be reached from the namespace).

>>> jane = MyNamedClass("Jane")
>>> bob = MyNamedClass("Bob")
>>> jane.friend = bob
>>> bob.friend = jane
>>> gc.get_referents(bob)
[{'name': 'Bob', 'friend': <__main__.MyNamedClass object at 0x7ff29a095d60>}, <class '__main__

In this cycle, we can see that the object pointed to by bob contains pointers to its attribute dictionary, which holds bob's name and its friend (the MyNamedClass instance that jane also points to). The bob object also has a pointer to the class object itself, since bob.__class__ returns that class object.
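The exact list returned by get_referents for an instance depends on the CPython version, since newer versions may store attributes without a separate dictionary. The sketch below therefore checks the two facts described above in a version-tolerant way. It assumes CPython 3.9 or later, and Person is a hypothetical stand-in for MyNamedClass.

```python
import gc


class Person:
    """Hypothetical stand-in for MyNamedClass (no __del__ needed here)."""

    def __init__(self, name):
        self.name = name


jane = Person("Jane")
bob = Person("Bob")
jane.friend = bob
bob.friend = jane

referents = gc.get_referents(bob)

# bob points at its class object.
points_to_class = any(r is Person for r in referents)

# bob points at jane, either directly or through its attribute dictionary.
within_two_hops = list(referents)
for r in referents:
    if isinstance(r, dict):
        within_two_hops.extend(gc.get_referents(r))
reaches_jane = any(r is jane for r in within_two_hops)

print(points_to_class, reaches_jane)
```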

When the garbage collector runs, it checks whether every object it knows about (that is, any object for which gc.is_tracked returns True) is reachable from the namespace. Conceptually, it follows all pointers from the namespace, then the pointers inside the objects those point to, and so on, until it has built up a complete view of everything that is accessible from the code. (Internally, CPython reaches the same result by subtracting the references that tracked objects hold to one another from their reference counts, which reveals the objects that are referenced from outside.)

If after doing this, the GC finds that there are some objects that cannot be reached from the namespace, then it can clear these objects.
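The reachability idea can be sketched in pure Python with gc.get_referents. This is a simplified illustration of the concept, not CPython's internal algorithm; reachable_ids is a hypothetical helper, and a plain list stands in for the namespace.

```python
import gc


def reachable_ids(roots):
    """Return the ids of every object reachable from roots via referents."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) in seen:
            continue  # already visited, which also protects against cycles
        seen.add(id(obj))
        stack.extend(gc.get_referents(obj))
    return seen


# A cycle that is still referenced from our stand-in namespace.
a, b = ["a"], ["b"]
a.append(b)
b.append(a)
namespace = [a]

# A second cycle that nothing in the stand-in namespace points to.
x, y = ["x"], ["y"]
x.append(y)
y.append(x)

reached = reachable_ids([namespace])
a_reachable = id(a) in reached and id(b) in reached
x_reachable = id(x) in reached or id(y) in reached
print(a_reachable, x_reachable)
```

Anything that is tracked but missing from the reached set would be a candidate for removal.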

Remember that any object still in memory must have a non-zero reference count; otherwise reference counting would already have deleted it. Objects that are unreachable but still have a non-zero reference count must be part of a circular reference, which is why we are so concerned about these cycles forming.
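We can watch an unreachable cycle survive reference counting by keeping a weak reference to one of its members, since a weak reference does not keep the object alive. This is a minimal sketch; Node is a hypothetical class, and automatic collection is disabled so that it cannot interfere.

```python
import gc
import weakref


class Node:
    """Hypothetical container class used only for this illustration."""


gc.disable()  # make sure only our explicit gc.collect() call runs
try:
    a, b = Node(), Node()
    a.other, b.other = b, a   # each object keeps the other's count above zero
    probe = weakref.ref(a)    # lets us observe a without keeping it alive

    del a, b                  # unreachable now, but not freed
    alive_before = probe() is not None

    gc.collect()              # the cycle detector breaks the cycle
    alive_after = probe() is not None
finally:
    gc.enable()

print(alive_before, alive_after)
```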

Let's go back to our reference cycle, jane and bob, and turn it into an isolated cycle by removing the pointers from the namespace.

>>> del jane
>>> del bob

Now we have exactly the situation the garbage collector is designed to solve. We can trigger a manual garbage collection by calling gc.collect().

>>> gc.collect()
Deleting Bob!
Deleting Jane!
4

By default, the garbage collector performs this action automatically from time to time, triggered as more and more container objects are created and destroyed while CPython is running.
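How often "from time to time" is depends on thresholds that the gc module exposes. Roughly speaking, a collection starts when the number of allocations minus deallocations of tracked objects exceeds the first threshold. The sketch below only reads the current settings; the actual numbers vary between CPython versions, so no particular output is assumed.

```python
import gc

thresholds = gc.get_threshold()  # a tuple of three threshold values
counts = gc.get_count()          # the current collection counts

print("thresholds:", thresholds)
print("current counts:", counts)
```

The thresholds can be tuned with gc.set_threshold() if the defaults do not suit a particular workload.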

In the code snippet above, the output contains the messages printed by the __del__ method of MyNamedClass, followed by a number, in this case 4. This number is the return value of gc.collect() and tells us how many unreachable objects the collector found.
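The return value can also be captured in a script rather than read off the interactive prompt. In this sketch Person is a hypothetical stand-in for MyNamedClass. The exact count depends on the CPython version, because attribute dictionaries are not always separate objects, so we only expect it to cover at least the two instances.

```python
import gc


class Person:
    """Hypothetical stand-in for MyNamedClass."""

    def __init__(self, name):
        self.name = name


gc.disable()  # keep automatic collections out of the measurement
try:
    gc.collect()  # clear out any garbage that already exists

    jane, bob = Person("Jane"), Person("Bob")
    jane.friend, bob.friend = bob, jane
    del jane, bob  # isolate the cycle

    found = gc.collect()  # number of unreachable objects found
finally:
    gc.enable()

print("unreachable objects found:", found)
```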


Statement: This article is reproduced from juejin.im.