Home  >  Article  >  Backend Development  >  Understand how to send logs asynchronously to a remote server in python

Understand how to send logs asynchronously to a remote server in python

coldplay.xixi
coldplay.xixiforward
2020-10-22 18:51:552373browse

python video tutorial column explains how to asynchronously send logs to a remote server in python.

Understand how to send logs asynchronously to a remote server in python

The most common way to use logs in python is to output logs in the console and files. The logging module also provides corresponding classes. It is also very convenient to use, but sometimes we may have some needs, such as needing to send logs to the remote end, or write directly to the database. How to achieve this need?

1. StreamHandler and FileHandler

First we write a set of code that simply outputs to cmd and files

# -*- coding: utf-8 -*-"""
-------------------------------------------------
   File Name:     loger
   Description :
   Author :       yangyanxing
   date:          2020/9/23
-------------------------------------------------
"""import loggingimport sysimport os# 初始化loggerlogger = logging.getLogger("yyx")
logger.setLevel(logging.DEBUG)# 设置日志格式fmt = logging.Formatter('[%(asctime)s] [%(levelname)s] %(message)s', '%Y-%m-%d %H:%M:%S')# 添加cmd handlercmd_handler = logging.StreamHandler(sys.stdout)
cmd_handler.setLevel(logging.DEBUG)
cmd_handler.setFormatter(fmt)# 添加文件的handlerlogpath = os.path.join(os.getcwd(), 'debug.log')
file_handler = logging.FileHandler(logpath)
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(fmt)# 将cmd和file handler添加到logger中logger.addHandler(cmd_handler)
logger.addHandler(file_handler)

logger.debug("今天天气不错")复制代码

First initialize a logger, and set its log level to DEBUG, and then initialize cmd_handler and file_handler , finally add them to the logger, run the script, and will be printed in cmd [2020-09-23 10:45:56] [DEBUG] The weather is good today and will be written to the current directory debug.log file.

2. Add HTTPHandler

If you want to send the log to the remote server when recording, you can add a HTTPHandler, in python In the standard library logging.handler, many handlers have been defined for us, some of which we can use directly. We use tornado locally to write an interface for receiving logs and print out all the received parameters.

# 添加一个httphandlerimport logging.handlers
http_handler = logging.handlers.HTTPHandler(r"127.0.0.1:1987", '/api/log/get')
http_handler.setLevel(logging.DEBUG)
http_handler.setFormatter(fmt)
logger.addHandler(http_handler)

logger.debug("今天天气不错")复制代码

The results are in the service We received a lot of information

{
    'name': [b 'yyx'],
    'msg': [b '\xe4\xbb\x8a\xe5\xa4\xa9\xe5\xa4\xa9\xe6\xb0\x94\xe4\xb8\x8d\xe9\x94\x99'],
    'args': [b '()'],
    'levelname': [b 'DEBUG'],
    'levelno': [b '10'],
    'pathname': [b 'I:/workplace/yangyanxing/test/loger.py'],
    'filename': [b 'loger.py'],
    'module': [b 'loger'],
    'exc_info': [b 'None'],
    'exc_text': [b 'None'],
    'stack_info': [b 'None'],
    'lineno': [b '41'],
    'funcName': [b ''],
    'created': [b '1600831054.8881223'],
    'msecs': [b '888.1223201751709'],
    'relativeCreated': [b '22.99976348876953'],
    'thread': [b '14876'],
    'threadName': [b 'MainThread'],
    'processName': [b 'MainProcess'],
    'process': [b '8648'],
    'message': [b '\xe4\xbb\x8a\xe5\xa4\xa9\xe5\xa4\xa9\xe6\xb0\x94\xe4\xb8\x8d\xe9\x94\x99'],
    'asctime': [b '2020-09-23 11:17:34']
}复制代码

It can be said that there is a lot of information, but it is not what we want. We just want something similar to[2020-09-23 10:45: 56] [DEBUG] The weather is good today Such a log.

logging.handlers.HTTPHandler simply sends all the information in the log to the server. As for how the server should organize it The content is completed by the server. So we can have two methods. One is to change the server code and reorganize the log content according to the passed log information. The second is to rewrite a class and let it be sent. The reformatted log content will be sent to the server.

We use the second method because this method is more flexible. The server is only used for recording. What content to send should be decided by the client. .

We need to redefine a class. We can refer to logging.handlers.HTTPHandler and rewrite an httpHandler class.

Each log class needs to rewrite emit method, the actual execution when recording logs is the emit method

class CustomHandler(logging.Handler):
    def __init__(self, host, uri, method="POST"):
        logging.Handler.__init__(self)
        self.url = "%s/%s" % (host, uri)
        method = method.upper()        if method not in ["GET", "POST"]:            raise ValueError("method must be GET or POST")
        self.method = method    def emit(self, record):
        '''
        :param record:
        :return:
        '''
        msg = self.format(record)        if self.method == "GET":            if (self.url.find("?") >= 0):
                sep = '&'
            else:
                sep = '?'
            url = self.url + "%c%s" % (sep, urllib.parse.urlencode({"log": msg}))
            requests.get(url, timeout=1)        else:
            headers = {                "Content-type": "application/x-www-form-urlencoded",                "Content-length": str(len(msg))
            }
            requests.post(self.url, data={'log': msg}, headers=headers, timeout=1)复制代码

There is a line in the above code that defines the parameters to be sentmsg = self.format(record)

This line of code indicates that the corresponding content will be returned according to the format set by the log object. Afterwards, the content is sent through the requests library. Regardless of using the get or post method, the server can receive the log normally

[2020-09-23 11:43:50] [DEBUG] The weather is nice today

3. Asynchronously sending remote logs

Now we consider a problem. When the log is sent to the remote server, if the remote server processes it very slowly, it will consume a lot of time. After a certain period of time, logging will slow down at this time

Modify the server log processing class and let it pause for 5 seconds to simulate a long processing process

async def post(self):
    print(self.getParam('log'))    await asyncio.sleep(5)
    self.write({"msg": 'ok'})复制代码

At this time we will Printing the above log

logger.debug("今天天气不错")
logger.debug("是风和日丽的")复制代码

the output obtained is

[2020-09-23 11:47:33] [DEBUG] 今天天气不错
[2020-09-23 11:47:38] [DEBUG] 是风和日丽的复制代码

We noticed that their time interval is also 5 seconds.

Then now comes the problem. It was originally just a log, but now it has become a burden that drags down the entire script, so we need to handle remote log writing asynchronously.

3.1 Use multi-thread processing

The first thing to think about is to use multiple threads to execute the log sending method

def emit(self, record):
    msg = self.format(record)    if self.method == "GET":        if (self.url.find("?") >= 0):
            sep = '&'
        else:
            sep = '?'
        url = self.url + "%c%s" % (sep, urllib.parse.urlencode({"log": msg}))
        t = threading.Thread(target=requests.get, args=(url,))
        t.start()    else:
        headers = {            "Content-type": "application/x-www-form-urlencoded",            "Content-length": str(len(msg))
        }
        t = threading.Thread(target=requests.post, args=(self.url,), kwargs={"data":{'log': msg}, "headers":headers})
        t.start()复制代码

This method can achieve the main purpose of not blocking, but Each time a log is printed, a thread needs to be opened, which is also a waste of resources. We can also use thread pools to process

3.2 Use thread pools to process

There are ThreadPoolExecutor and ProcessPoolExecutor classes in python’s concurrent.futures, which are thread pools and process pools. They are first used during initialization. Define several threads, and then let these threads handle the corresponding functions, so that you do not need to create new threads every time

Basic use of thread pool

exector = ThreadPoolExecutor(max_workers=1) # 初始化一个线程池,只有一个线程exector.submit(fn, args, kwargs) # 将函数submit到线程池中复制代码

If there are n threads in the thread pool , when the number of submitted tasks is greater than n, the excess tasks will be placed in the queue.

Modify the above emit function again

exector = ThreadPoolExecutor(max_workers=1)def emit(self, record):
    msg = self.format(record)
    timeout = aiohttp.ClientTimeout(total=6)    if self.method == "GET":        if (self.url.find("?") >= 0):
            sep = '&'
        else:
            sep = '?'
        url = self.url + "%c%s" % (sep, urllib.parse.urlencode({"log": msg}))
        exector.submit(requests.get, url, timeout=6)    else:
        headers = {            "Content-type": "application/x-www-form-urlencoded",            "Content-length": str(len(msg))
        }
        exector.submit(requests.post, self.url, data={'log': msg}, headers=headers, timeout=6)复制代码

Why only initialize a thread with only one thread here? Pool? Because in this way, it can be guaranteed that the logs in the advanced queue will be sent first. If there are multiple threads in the pool, the order is not necessarily guaranteed.

3.3 Use the asynchronous aiohttp library to send requests

The emit method in the CustomHandler class above uses requests.post to send logs. The requests themselves are blocking and running, which is why Its existence makes the script stuck for a long time, so we can replace the blocking requests library with asynchronous aiohttp to execute the get and post methods, and rewrite the emit method in a CustomHandler

class CustomHandler(logging.Handler):
    def __init__(self, host, uri, method="POST"):
        logging.Handler.__init__(self)
        self.url = "%s/%s" % (host, uri)
        method = method.upper()        if method not in ["GET", "POST"]:            raise ValueError("method must be GET or POST")
        self.method = method    async def emit(self, record):
        msg = self.format(record)
        timeout = aiohttp.ClientTimeout(total=6)        if self.method == "GET":            if (self.url.find("?") >= 0):
                sep = '&'
            else:
                sep = '?'
            url = self.url + "%c%s" % (sep, urllib.parse.urlencode({"log": msg}))            async with aiohttp.ClientSession(timeout=timeout) as session:                async with session.get(self.url) as resp:
                    print(await resp.text())        else:
            headers = {                "Content-type": "application/x-www-form-urlencoded",                "Content-length": str(len(msg))
            }            async with aiohttp.ClientSession(timeout=timeout, headers=headers) as session:                async with session.post(self.url, data={'log': msg}) as resp:
                    print(await resp.text())复制代码

At this time, the code execution crashed

C:\Python37\lib\logging\__init__.py:894: RuntimeWarning: coroutine 'CustomHandler.emit' was never awaited
  self.emit(record)
RuntimeWarning: Enable tracemalloc to get the object allocation traceback复制代码

服务端也没有收到发送日志的请求。

究其原因是由于emit方法中使用async with session.post 函数,它需要在一个使用async 修饰的函数里执行,所以修改emit函数,使用async来修饰,这里emit函数变成了异步的函数, 返回的是一个coroutine 对象,要想执行coroutine对象,需要使用await, 但是脚本里却没有在哪里调用 await emit() ,所以崩溃信息中显示coroutine 'CustomHandler.emit' was never awaited.

既然emit方法返回的是一个coroutine对象,那么我们将它放一个loop中执行

async def main():
    await logger.debug("今天天气不错")    await logger.debug("是风和日丽的")

loop = asyncio.get_event_loop()
loop.run_until_complete(main())复制代码

执行依然报错

raise TypeError('An asyncio.Future, a coroutine or an awaitable is '复制代码

意思是需要的是一个coroutine,但是传进来的对象不是。

这似乎就没有办法了,想要使用异步库来发送,但是却没有可以调用await的地方.

解决办法是有的,我们使用 asyncio.get_event_loop() 获取一个事件循环对象, 我们可以在这个对象上注册很多协程对象,这样当执行事件循环的时候,就是去执行注册在该事件循环上的协程, 我们通过一个小例子来看一下

import asyncio 

async def test(n):
    while n > 0:        await asyncio.sleep(1)
        print("test {}".format(n))
        n -= 1
    return n    
async def test2(n):
    while n >0:        await asyncio.sleep(1)
        print("test2 {}".format(n))
        n -= 1def stoploop(task):
    print("执行结束, task n is {}".format(task.result()))
    loop.stop()

loop = asyncio.get_event_loop()
task = loop.create_task(test(5))
task2 = loop.create_task(test2(3))
task.add_done_callback(stoploop)
task2 = loop.create_task(test2(3))

loop.run_forever()复制代码

我们使用loop = asyncio.get_event_loop() 创建了一个事件循环对象loop, 并且在loop上创建了两个task, 并且给task1添加了一个回调函数,在task1它执行结束以后,将loop停掉.

注意看上面的代码,我们并没有在某处使用await来执行协程,而是通过将协程注册到某个事件循环对象上,然后调用该循环的run_forever() 函数,从而使该循环上的协程对象得以正常的执行.

上面得到的输出为

test 5
test2 3
test 4
test2 2
test 3
test2 1
test 2
test 1
执行结束, task n is 0复制代码

可以看到,使用事件循环对象创建的task,在该循环执行run_forever() 以后就可以执行了.

如果不执行loop.run_forever() 函数,则注册在它上面的协程也不会执行

loop = asyncio.get_event_loop()
task = loop.create_task(test(5))
task.add_done_callback(stoploop)
task2 = loop.create_task(test2(3))
time.sleep(5)# loop.run_forever()复制代码

上面的代码将loop.run_forever() 注释掉,换成time.sleep(5) 停5秒, 这时脚本不会有任何输出,在停了5秒以后就中止了.

回到之前的日志发送远程服务器的代码,我们可以使用aiohttp封装一个发送数据的函数, 然后在emit中将这个函数注册到全局的事件循环对象loop中,最后再执行loop.run_forever() .

loop = asyncio.get_event_loop()class CustomHandler(logging.Handler):
    def __init__(self, host, uri, method="POST"):
        logging.Handler.__init__(self)
        self.url = "%s/%s" % (host, uri)
        method = method.upper()        if method not in ["GET", "POST"]:            raise ValueError("method must be GET or POST")
        self.method = method    # 使用aiohttp封装发送数据函数
    async def submit(self, data):
        timeout = aiohttp.ClientTimeout(total=6)        if self.method == "GET":            if self.url.find("?") >= 0:
                sep = '&'
            else:
                sep = '?'
            url = self.url + "%c%s" % (sep, urllib.parse.urlencode({"log": data}))            async with aiohttp.ClientSession(timeout=timeout) as session:                async with session.get(url) as resp:
                    print(await resp.text())        else:
            headers = {                "Content-type": "application/x-www-form-urlencoded",
            }            async with aiohttp.ClientSession(timeout=timeout, headers=headers) as session:                async with session.post(self.url, data={'log': data}) as resp:
                    print(await resp.text())        return True

    def emit(self, record):
        msg = self.format(record)
        loop.create_task(self.submit(msg))# 添加一个httphandlerhttp_handler = CustomHandler(r"http://127.0.0.1:1987", 'api/log/get')
http_handler.setLevel(logging.DEBUG)
http_handler.setFormatter(fmt)
logger.addHandler(http_handler)

logger.debug("今天天气不错")
logger.debug("是风和日丽的")

loop.run_forever()复制代码

这时脚本就可以正常的异步执行了.

loop.create_task(self.submit(msg)) 也可以使用

asyncio.ensure_future(self.submit(msg), loop=loop)

来代替,目的都是将协程对象注册到事件循环中.

但这种方式有一点要注意,loop.run_forever() 将会一直阻塞,所以需要有个地方调用loop.stop()方法. 可以注册到某个task的回调中.

相关免费学习推荐:python视频教程

The above is the detailed content of Understand how to send logs asynchronously to a remote server in python. For more information, please follow other related articles on the PHP Chinese website!

Statement:
This article is reproduced at:juejin.im. If there is any infringement, please contact admin@php.cn delete