ThreadLocal variables in Python
Werkzeug is a WSGI utility library. For reasons discussed below, it does not use Python's built-in thread-local class (threading.local) directly, but implements its own family of Local classes: the basic Local, plus LocalStack, LocalManager and LocalProxy built on top of it. Let's look at how these classes are used, why they were designed this way, and the techniques used to implement them.
Design of the Local class
The designers of Werkzeug considered Python's built-in ThreadLocal insufficient, mainly for the following two reasons:
First, Werkzeug uses "ThreadLocal" data to support concurrent request handling, and Python's built-in ThreadLocal only isolates data between threads. Python has other concurrency models as well, such as the commonly used coroutines (greenlets), so a Local object that also supports coroutines is needed.
Second, WSGI does not guarantee that a new thread is created for every request, which means threads can be reused (for example, a thread pool may be maintained to handle requests). If Werkzeug used Python's built-in ThreadLocal, a new request could be handled by an "unclean" thread, one that still holds data from a previously processed request.
To solve these two problems, Werkzeug implements its own Local class. Local objects isolate data between threads as well as between coroutines. They also support clearing the data belonging to a particular thread or coroutine, so that once a request has been processed its data can be cleaned up before the next request arrives.
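The second problem can be reproduced with the standard library alone. The following is a minimal sketch, not Werkzeug code: `handle_request` is a hypothetical handler, and a one-worker `ThreadPoolExecutor` stands in for a WSGI server that reuses its threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

ctx = threading.local()  # Python's built-in thread-local storage


def handle_request(user=None):
    # Hypothetical request handler: only some requests store a user.
    if user is not None:
        ctx.user = user
    # A request that stored nothing still sees what the thread kept around.
    return getattr(ctx, "user", None)


# A single worker thread, so both "requests" run on the same reused thread.
with ThreadPoolExecutor(max_workers=1) as pool:
    first = pool.submit(handle_request, "alice").result()
    second = pool.submit(handle_request).result()

print(first, second)  # alice alice
```

The second request never set `user`, yet it reads the value left behind by the first one, because nothing cleared the thread's data in between.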
How is this implemented? The idea is actually very simple, and we mentioned it at the end of the article "In-depth Understanding of ThreadLocal Variables in Python (Part 1)": create a global dictionary that uses the thread (or coroutine) identifier as the key and the local data of that thread (or coroutine) as the value. Werkzeug follows this idea, uses some of Python's black magic along the way, and ends up offering users a clear and simple interface.
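Before looking at Werkzeug's version, the bare idea can be sketched in a few lines. The names `_storage`, `set_local` and `get_local` are made up for this illustration, and only threads (not greenlets) are handled.

```python
import threading

# Global dictionary: {thread identifier: {name: value}}
_storage = {}


def set_local(name, value):
    # Store the value under the calling thread's identifier.
    _storage.setdefault(threading.get_ident(), {})[name] = value


def get_local(name, default=None):
    # Look only at the data that belongs to the calling thread.
    return _storage.get(threading.get_ident(), {}).get(name, default)


def worker(results):
    # The worker thread has stored nothing, so it sees the default.
    results.append(get_local("user"))


set_local("user", "main-thread")
results = []
t = threading.Thread(target=worker, args=(results,))
t.start()
t.join()

print(get_local("user"), results)  # main-thread [None]
```

The interface is clumsy compared with plain attribute access, which is the gap the magic methods discussed below are meant to close.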
Specific implementation
The implementation of the Local class lives in werkzeug.local; the analysis here is based on the code at version 8a84b62. From the first two articles on ThreadLocal we already know the characteristics and usage of Local objects, so instead of giving more usage examples, let's look at the code directly.
class Local(object):
    __slots__ = ('__storage__', '__ident_func__')

    def __init__(self):
        object.__setattr__(self, '__storage__', {})
        object.__setattr__(self, '__ident_func__', get_ident)

    ...
Since there may be a large number of Local objects, __slots__ is used to reduce the memory each one occupies by fixing the attributes a Local can have:
__storage__: a dictionary that holds the actual data, initially empty;
__ident_func__: a function that returns the identifier of the current thread or coroutine.
Since the actual data of a Local object is stored in __storage__, operations on the attributes of a Local are really operations on __storage__. For reading attributes, the magic method __getattr__ intercepts access to any attribute other than __storage__ and __ident_func__ and redirects it to the data of the current thread or coroutine stored in __storage__. Setting and deleting attribute values are implemented with __setattr__ and __delattr__ respectively (see attribute control for an introduction to these magic methods). The key code is as follows:
    def __getattr__(self, name):
        try:
            return self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        ident = self.__ident_func__()
        storage = self.__storage__
        try:
            storage[ident][name] = value
        except KeyError:
            storage[ident] = {name: value}

    def __delattr__(self, name):
        try:
            del self.__storage__[self.__ident_func__()][name]
        except KeyError:
            raise AttributeError(name)
Suppose we have N threads or coroutines with IDs 1, 2, ..., N, and each uses the Local object to save some of its own local data. The contents of the Local object are then as shown in the figure below:
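The same layout can be printed from running code. This sketch uses `MiniLocal`, a cut-down stand-in written for this illustration (it keys on `threading.get_ident()` only and uses a plain `_storage` attribute), rather than Werkzeug's own class.

```python
import threading


class MiniLocal:
    """Cut-down stand-in for the Local class above (not Werkzeug's code)."""

    def __init__(self):
        # Bypass our own __setattr__ so _storage lands on the instance itself.
        object.__setattr__(self, "_storage", {})

    def __getattr__(self, name):
        try:
            return self._storage[threading.get_ident()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._storage.setdefault(threading.get_ident(), {})[name] = value


N = 3
local = MiniLocal()
# The barrier keeps all N threads alive at once, so their identifiers differ.
barrier = threading.Barrier(N)


def worker(i):
    local.request = "request-%d" % i
    barrier.wait()


threads = [threading.Thread(target=worker, args=(i,)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# One inner dictionary per thread identifier.
for ident, data in local._storage.items():
    print(ident, data)
```

Each thread wrote to the same `local.request` attribute, but the storage ends up with N separate entries, one per thread identifier.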
In addition, the Local class also provides the __release_local__ method to release the data saved by the current thread or coroutine.
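What releasing buys us can be sketched as follows, assuming that releasing simply drops the current identifier's entry from the storage dictionary. `MiniLocal`, its `release` method and `handle_request` are hypothetical names for this illustration, and a one-worker thread pool stands in for a server that reuses threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MiniLocal:
    """Cut-down stand-in for Local with a release method (not Werkzeug's code)."""

    def __init__(self):
        object.__setattr__(self, "_storage", {})

    def __getattr__(self, name):
        try:
            return self._storage[threading.get_ident()][name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self._storage.setdefault(threading.get_ident(), {})[name] = value

    def release(self):
        # Drop everything stored for the calling thread.
        self._storage.pop(threading.get_ident(), None)


local = MiniLocal()


def handle_request(user=None):
    try:
        if user is not None:
            local.user = user
        return getattr(local, "user", None)
    finally:
        # Clean up before the thread picks up its next request.
        local.release()


with ThreadPoolExecutor(max_workers=1) as pool:
    first = pool.submit(handle_request, "alice").result()
    second = pool.submit(handle_request).result()

print(first, second)  # alice None
```

Because the data is released at the end of each request, the reused thread starts the second request with a clean slate.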
Local extension interface
Werkzeug implements LocalStack and LocalManager based on Local to provide more friendly interface support.
LocalStack
LocalStack wraps a Local to provide a stack that is independent for each thread (or coroutine). The docstring in the source describes its usage in detail. A simple usage example is as follows:
from werkzeug.local import LocalStack

ls = LocalStack()
ls.push(12)
print(ls.top)                 # 12
print(ls._local.__storage__)  # {140735190843392: {'stack': [12]}}
The implementation of LocalStack is quite interesting. It stores a Local object as its own attribute _local and defines the push, pop and top methods to perform the corresponding stack operations. The stack itself is simulated by the list _local.__storage__[_local.__ident_func__()]['stack'], and push, pop and top work by manipulating this list. Note that the methods do not need to spell out this full expression to obtain the list; they can simply call getattr() on _local. Taking the push function as an example, the implementation is as follows:
    def push(self, obj):
        """Pushes a new item to the stack"""
        rv = getattr(self._local, 'stack', None)
        if rv is None:
            self._local.stack = rv = []
        rv.append(obj)
        return rv
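The other two operations can be sketched in the same style. `MiniLocalStack` below is an illustration rather than Werkzeug's code: it uses `threading.local` as a stand-in for Local, and the choice to drop the stack attribute once the last item is popped is an assumption of this sketch.

```python
import threading


class MiniLocalStack:
    """Per-thread stack; threading.local stands in for Werkzeug's Local."""

    def __init__(self):
        self._local = threading.local()

    def push(self, obj):
        rv = getattr(self._local, "stack", None)
        if rv is None:
            self._local.stack = rv = []
        rv.append(obj)
        return rv

    def pop(self):
        stack = getattr(self._local, "stack", None)
        if stack is None:
            return None
        if len(stack) == 1:
            # Last item: remove the per-thread list instead of leaving it empty.
            del self._local.stack
            return stack[-1]
        return stack.pop()

    @property
    def top(self):
        try:
            return self._local.stack[-1]
        except (AttributeError, IndexError):
            return None


ls = MiniLocalStack()
ls.push(1)
ls.push(2)

# Another thread sees its own (empty) stack, not the main thread's.
seen = []
t = threading.Thread(target=lambda: seen.append(ls.top))
t.start()
t.join()

top_before = ls.top
popped = [ls.pop(), ls.pop(), ls.pop()]
print(top_before, seen, popped)  # 2 [None] [2, 1, None]
```

Popping from an empty stack returns None here instead of raising, which keeps calling code simple when no context has been pushed yet.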
The implementations of pop and top resemble those of an ordinary stack; both operate on the list obtained with stack = getattr(self._local, 'stack', None). In addition, LocalStack lets us customize __ident_func__. The built-in function property is used to create a descriptor that wraps the get and set operations for __ident_func__ and exposes them through a single attribute named __ident_func__. The specific code is as follows:
    def _get__ident_func__(self):
        return self._local.__ident_func__

    def _set__ident_func__(self, value):
        object.__setattr__(self._local, '__ident_func__', value)

    __ident_func__ = property(_get__ident_func__, _set__ident_func__)
    del _get__ident_func__, _set__ident_func__
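The same pattern, a property built from two helper functions that are then deleted from the class body, can be tried in isolation. `Inner`, `Wrapper` and `ident_func` are hypothetical names for this sketch.

```python
class Inner:
    def __init__(self):
        self.ident_func = lambda: "default"


class Wrapper:
    def __init__(self):
        self._inner = Inner()

    def _get_ident_func(self):
        return self._inner.ident_func

    def _set_ident_func(self, value):
        self._inner.ident_func = value

    # The property keeps references to both helpers...
    ident_func = property(_get_ident_func, _set_ident_func)
    # ...so their names can be removed from the class namespace.
    del _get_ident_func, _set_ident_func


w = Wrapper()
before = w.ident_func()           # read goes through the getter
w.ident_func = lambda: "custom"   # write goes through the setter
after = w._inner.ident_func()

print(before, after, hasattr(Wrapper, "_get_ident_func"))  # default custom False
```

Deleting the helpers leaves only the public attribute on the class, while reads and writes are still forwarded to the wrapped object.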
LocalManager
Local and LocalStack are each a single thread-independent or coroutine-independent object. Often we need a container, itself independent of threads and coroutines, to organize several Local or LocalStack objects (just as we use a list to organize several ints or strings).
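Werkzeug implements LocalManager for this purpose. It stores the Local or LocalStack objects it manages in a list attribute named locals, and provides a cleanup method that releases all of them. The most important interface of LocalManager in Werkzeug is the decorator method make_middleware, whose code is as follows: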
    def make_middleware(self, app):
        """Wrap a WSGI application so that cleaning up happens after
        request end.
        """
        def application(environ, start_response):
            return ClosingIterator(app(environ, start_response), self.cleanup)
        return application
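This decorator registers cleanup as a callback: once a thread (or coroutine) has finished handling a request, cleanup is called to clear the Local or LocalStack objects the manager is responsible for (ClosingIterator is implemented in werkzeug.wsgi). Below is a simple example of using LocalManager: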
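from werkzeug.local import Local, LocalManager
from werkzeug.wrappers import Request

local = Local()
local_2 = Local()
local_manager = LocalManager([local, local_2])

def application(environ, start_response):
    local.request = request = Request(environ)
    ...

# Wrap the application so that the contents of local_manager are
# cleaned up automatically once the application has handled a request.
application = local_manager.make_middleware(application)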
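Why cleanup runs only after the response has been sent can be seen from a minimal stand-in for ClosingIterator. The sketch below is an illustration, not the werkzeug.wsgi implementation: `MiniClosingIterator`, the module-level `make_middleware` function and the `events` list are hypothetical, and the last few lines imitate what a WSGI server does with the response.

```python
class MiniClosingIterator:
    """Wraps a WSGI response iterable and runs a callback in close()."""

    def __init__(self, iterable, callback):
        self._iterator = iter(iterable)
        self._inner_close = getattr(iterable, "close", None)
        self._callback = callback

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iterator)

    def close(self):
        # Close the wrapped response first, then run the cleanup callback.
        if self._inner_close is not None:
            self._inner_close()
        self._callback()


events = []


def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    events.append("app called")
    return [b"hello"]


def make_middleware(app, cleanup):
    def application(environ, start_response):
        return MiniClosingIterator(app(environ, start_response), cleanup)
    return application


wrapped = make_middleware(app, lambda: events.append("cleanup"))

# Imitate a WSGI server: call the app, send the body, then call close().
response = wrapped({}, lambda status, headers: None)
body = b"".join(response)
events.append("body sent")
response.close()

print(body, events)  # b'hello' ['app called', 'body sent', 'cleanup']
```

The WSGI specification requires servers to call close() on the returned iterable when it has one, which is what makes this a dependable hook for end-of-request cleanup.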
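With LocalManager's make_middleware, all managed Local or LocalStack objects can be cleared after a thread (or coroutine) has finished handling a request, so that the thread is ready to handle another one. This solves the second problem raised at the beginning of the article. werkzeug.local also implements LocalProxy, which acts as a proxy for Local objects and is well worth studying too.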