In-depth analysis of the role and usage of descriptors in Python
Generally speaking, a descriptor is an object attribute with "binding behavior": one whose attribute access is overridden by the methods of the descriptor protocol. These methods are __get__(), __set__(), and __delete__(). An object that defines any of these methods is called a descriptor.
The default behavior for attribute access is to get, set, or delete the attribute from the object's dictionary (__dict__). For example, the lookup for a.x starts with a.__dict__['x'], then type(a).__dict__['x'], and continues through the base classes of type(a) (excluding metaclasses). If the value found is a descriptor, Python calls the descriptor's method instead of applying the default behavior. Where in the lookup chain this override occurs depends on which descriptor methods are defined. Note that descriptors only work inside new-style classes, that is, classes that inherit from object or type. (In Python 3, every class is new-style.)
Descriptors are powerful and widely used. Descriptors are the mechanism behind properties, instance methods, static methods, class methods, and super. Descriptors are used extensively in Python itself to implement the new-style classes introduced in Python 2.2. Descriptors simplify the underlying C code and provide a flexible new set of tools for everyday programming in Python.
Descriptor Protocol
descr.__get__(self, obj, type=None) --> value
descr.__set__(self, obj, value) --> None
descr.__delete__(self, obj) --> None
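That is the whole protocol. An object that defines any of these methods is considered a descriptor and can override the default lookup behavior when it is accessed as an attribute of another object (this point is very important).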
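As a minimal sketch of the protocol, the example below routes reading, assignment, and deletion of one attribute to the three methods. The names Recorder, Holder, and attr are hypothetical and chosen only for illustration.

```python
class Recorder(object):
    """Hypothetical descriptor that records which protocol method was called."""

    def __init__(self):
        self.calls = []
        self.value = None

    def __get__(self, obj, objtype=None):
        self.calls.append('get')
        return self.value

    def __set__(self, obj, value):
        self.calls.append('set')
        self.value = value

    def __delete__(self, obj):
        self.calls.append('delete')
        self.value = None


class Holder(object):
    attr = Recorder()   # the descriptor lives on the class, not on the instance


h = Holder()
h.attr = 42        # routed to Recorder.__set__
current = h.attr   # routed to Recorder.__get__
del h.attr         # routed to Recorder.__delete__

# Fetch the descriptor object itself from the class dictionary, bypassing __get__.
recorder = Holder.__dict__['attr']
print(recorder.calls)   # ['set', 'get', 'delete']
```

Nothing is ever written to h.__dict__ here, because every operation on attr is intercepted by the descriptor.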
If an object defines both __get__ and __set__, it is called a data descriptor. A descriptor that defines only __get__ is called a non-data descriptor.
Data descriptors and non-data descriptors differ in their priority relative to the instance's dictionary. If the instance dictionary has an entry with the same name as the descriptor, a data descriptor takes precedence over that entry, whereas for a non-data descriptor the dictionary entry takes precedence.
class B(object):
    def __init__(self):
        self.name = 'mink'

    def __get__(self, obj, objtype=None):
        return self.name

class A(object):
    name = B()

a = A()
print(a.__dict__)   # prints {}
print(a.name)       # prints mink
a.name = 'kk'
print(a.__dict__)   # prints {'name': 'kk'}
print(a.name)       # prints kk
Here B is a non-data descriptor, so the assignment a.name = 'kk' creates a name entry in a.__dict__, which then shadows the descriptor. Next, add a __set__ method to B:
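class B(object):
    def __init__(self):
        self.name = 'mink'

    def __get__(self, obj, objtype=None):
        return self.name

    def __set__(self, obj, value):
        self.name = value
        # ... do something

class A(object):
    name = B()

a = A()
print(a.__dict__)   # prints {}
print(a.name)       # prints mink
a.name = 'kk'
print(a.__dict__)   # prints {}
print(a.name)       # prints kk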
Because a data descriptor takes precedence over the instance's dictionary during attribute access, the assignment goes to the descriptor and a.__dict__ stays empty.
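One consequence of the example above is that B keeps the value on the descriptor object itself, so every instance of A shares it. If per-instance values are wanted, one possible approach is to have the data descriptor read and write the owning instance's dictionary. The sketch below assumes a hypothetical PerInstance class and passes the storage key in by hand.

```python
class PerInstance(object):
    """Hypothetical data descriptor that keeps each value in the owning instance."""

    def __init__(self, key, default=None):
        self.key = key          # name used inside obj.__dict__ (an assumption of this sketch)
        self.default = default

    def __get__(self, obj, objtype=None):
        if obj is None:         # accessed on the class: return the descriptor itself
            return self
        return obj.__dict__.get(self.key, self.default)

    def __set__(self, obj, value):
        obj.__dict__[self.key] = value


class A(object):
    name = PerInstance('name', 'mink')


a1, a2 = A(), A()
a1.name = 'kk'
print(a1.name)       # kk
print(a2.name)       # mink
print(a1.__dict__)   # {'name': 'kk'}
```

The instance dictionary now holds the value, but reads still go through __get__ first, because a data descriptor outranks the dictionary entry of the same name.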
Descriptor call
The descriptor can be called directly like this: d.__get__(obj)
However, a more common situation is that the descriptor is automatically called when the property is accessed. For example, obj.d will look for d in the dictionary of obj. If d defines the __get__ method, then d.__get__(obj) will be called according to the following precedence rules.
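To make the two call styles concrete, the sketch below uses an ordinary function stored on a class, since functions are themselves non-data descriptors. Greeter and hello are hypothetical names.

```python
class Greeter(object):
    def hello(self):
        return 'hello from ' + type(self).__name__


g = Greeter()
func = Greeter.__dict__['hello']          # the plain function stored on the class

bound_direct = func.__get__(g, Greeter)   # explicit descriptor call
bound_auto = g.hello                      # the same call, made by attribute access

print(bound_direct())                     # hello from Greeter
print(bound_direct == bound_auto)         # True
```

Both routes produce a method bound to g; attribute access simply makes the __get__ call on your behalf.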
The details of the call depend on whether obj is a class or an instance. In addition, descriptors only work for new-style objects and new-style classes. Classes that inherit from object are called new-style classes.
For objects, the machinery is in object.__getattribute__(), which turns b.x into type(b).__dict__['x'].__get__(b, type(b)). The implementation works through a precedence chain: data descriptors take precedence over instance variables, instance variables take precedence over non-data descriptors, and __getattr__() (if the class defines it) has the lowest priority. The complete C implementation can be found in PyObject_GenericGetAttr() in Objects/object.c.
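The precedence chain for instances can be approximated in Python. The function below is a rough sketch, not a faithful port of PyObject_GenericGetAttr(); the helper names find_in_mro and generic_getattr, and the small test classes, are hypothetical.

```python
_MISSING = object()


def find_in_mro(cls, name):
    """Return the raw class attribute found by walking the MRO, or _MISSING."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def generic_getattr(obj, name):
    """Rough sketch of the instance lookup order described above."""
    cls = type(obj)
    cls_attr = find_in_mro(cls, name)
    attr_type = type(cls_attr)
    has_get = cls_attr is not _MISSING and hasattr(attr_type, '__get__')
    is_data = has_get and (hasattr(attr_type, '__set__')
                           or hasattr(attr_type, '__delete__'))

    if is_data:                          # 1. data descriptor on the class
        return attr_type.__get__(cls_attr, obj, cls)
    inst_dict = getattr(obj, '__dict__', {})
    if name in inst_dict:                # 2. instance variable
        return inst_dict[name]
    if has_get:                          # 3. non-data descriptor on the class
        return attr_type.__get__(cls_attr, obj, cls)
    if cls_attr is not _MISSING:         # 4. plain class attribute
        return cls_attr
    if hasattr(cls, '__getattr__'):      # 5. __getattr__ has the lowest priority
        return cls.__getattr__(obj, name)
    raise AttributeError(name)


class NonData(object):
    def __get__(self, obj, objtype=None):
        return 'non-data'


class Data(object):
    def __get__(self, obj, objtype=None):
        return 'data'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class C(object):
    nd = NonData()
    d = Data()

    def __getattr__(self, name):
        return 'fallback'


c = C()
c.__dict__['nd'] = 'instance'   # write directly, bypassing the descriptors
c.__dict__['d'] = 'instance'

print(generic_getattr(c, 'd'))         # data
print(generic_getattr(c, 'nd'))        # instance
print(generic_getattr(c, 'missing'))   # fallback
```

For these three names the sketch agrees with the built-in lookup: c.d, c.nd, and c.missing return the same values.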
For classes, the machinery is in type.__getattribute__(), which turns B.x into B.__dict__['x'].__get__(None, B). In pure Python, it looks like this:
def __getattribute__(self, key):
    "Emulate type_getattro() in Objects/typeobject.c"
    v = object.__getattribute__(self, key)
    if hasattr(v, '__get__'):
        return v.__get__(None, self)
    return v
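The difference between instance access and class access can be observed directly with a descriptor that simply returns the arguments it receives. Show is a hypothetical name used only for this sketch.

```python
class Show(object):
    """Hypothetical non-data descriptor that reports how it was reached."""

    def __get__(self, obj, objtype=None):
        return (obj, objtype)


class B(object):
    x = Show()


b = B()
print(b.x == (b, B))      # True: instance access passes the instance and its type
print(B.x == (None, B))   # True: class access passes None in place of the instance
```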
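A few important points:
Descriptors are invoked by the __getattribute__() method.
Overriding __getattribute__() prevents automatic descriptor calls.
object.__getattribute__() and type.__getattribute__() make different calls to __get__().
Data descriptors always override instance dictionaries.
Non-data descriptors may be overridden by instance dictionaries.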
Note: In Python 2.2, super(B, obj).m() would only call __get__() if m was a data descriptor. In Python 2.3, non-data descriptors are also invoked unless an old-style class is involved. The implementation details are in super_getattro() in Objects/typeobject.c. (Translator's note: the original text went on to say that an equivalent Python implementation could be found in Guido's Tutorial; that sentence has since been deleted from the original and is kept here only for reference.)
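As a small illustration of how super() invokes descriptors in current Python versions, the sketch below compares a lookup through super() with the equivalent explicit call on the base class's dictionary. Tracer, A, and B are hypothetical classes.

```python
class Tracer(object):
    """Hypothetical non-data descriptor that returns the arguments it was given."""

    def __get__(self, obj, objtype=None):
        return (obj, objtype)


class A(object):
    m = Tracer()


class B(A):
    m = 'overridden in B'


obj = B()
via_super = super(B, obj).m                  # skips B and finds m on A
explicit = A.__dict__['m'].__get__(obj, B)   # the call that super() makes for us

print(via_super == explicit)   # True
print(obj.m)                   # overridden in B
```

The descriptor receives the original instance and its class B, even though the attribute itself was found on A.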
The above shows that the mechanism of the descriptor is implemented in the __getattribute__() method of object, type, and super. Classes derived from object automatically inherit this mechanism, or they have a metaclass with a similar mechanism. Likewise, a class's __getattribute__() method can be overridden to turn off the descriptor behavior for this class.
Descriptor Example
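The following code defines a data descriptor that prints a message on every get and set. Overriding __getattribute__() is another way to give every attribute this behavior, but a descriptor is useful when you only want to monitor specific attributes.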
class RevealAccess(object):
    """A data descriptor that sets and returns values
    normally and prints a message logging their access.
    """

    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype):
        print('Retrieving', self.name)
        return self.val

    def __set__(self, obj, val):
        print('Updating', self.name)
        self.val = val

>>> class MyClass(object):
...     x = RevealAccess(10, 'var "x"')
...     y = 5
...
>>> m = MyClass()
>>> m.x
Retrieving var "x"
10
>>> m.x = 20
Updating var "x"
>>> m.x
Retrieving var "x"
20
>>> m.y
5
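The protocol is very simple and offers exciting possibilities. Several use cases are so common that they have been packaged into individual functions. Properties, bound and unbound methods, static methods, and class methods are all based on the descriptor protocol.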
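One way to check this claim is to inspect the raw objects stored in a class dictionary and see which protocol methods their types define. The Circle class below is a hypothetical example; the classification follows the data and non-data distinction described earlier.

```python
class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        return 2 * self.radius

    def describe(self):
        return 'circle of radius %s' % self.radius

    @staticmethod
    def unit():
        return 'cm'

    @classmethod
    def make(cls, radius):
        return cls(radius)


raw = vars(Circle)   # the class dictionary holds the undecorated descriptor objects
for name in ('diameter', 'describe', 'unit', 'make'):
    attr = raw[name]
    has_get = hasattr(type(attr), '__get__')
    kind = 'data' if hasattr(type(attr), '__set__') else 'non-data'
    print(name, type(attr).__name__, has_get, kind)
```

In this sketch only the property is reported as a data descriptor, which is why assigning to a property cannot be shadowed by an entry in the instance dictionary, while a method can be.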