
What is libuv, a brief analysis of event polling in libuv (Node core dependency)

青灯夜游 · 2022-03-22 19:58:55

This article walks through libuv, the core dependency of Node.js: what libuv is and how its event polling works. I hope it will be helpful to everyone!


When it comes to Node.js, most front-end engineers think of building servers with it: master just one language, JavaScript, and you can become a full-stack engineer. But the significance of Node.js goes well beyond that.

Many high-level languages can reach down to the operating system, but JavaScript running in the browser is an exception: the sandbox the browser creates shuts front-end engineers inside an ivory tower of the programming world. Node.js makes up for this shortcoming and lets front-end engineers reach the lower levels of the computer as well.

So the significance of Node.js for front-end engineers is not only full-stack development capability; more importantly, it opens a door to the underlying world of the computer. This article opens that door by analyzing how Node.js is implemented.

Node.js source code structure

There are more than a dozen dependencies in the /deps directory of the Node.js source repository, including modules written in C/C++ (such as libuv and V8) and modules written in JavaScript (such as acorn and acorn-plugins), as shown in the figure below.

[Figure: dependency modules under the /deps directory of the Node.js source]

  • acorn: a lightweight JavaScript parser, written in JavaScript.
  • acorn-plugins: extension modules for acorn that let it parse ES6 features such as class declarations.
  • brotli: the Brotli compression algorithm, written in C.
  • cares: i.e. c-ares, written in C, handles asynchronous DNS requests.
  • histogram: written in C, implements histogram generation.
  • icu-small: the ICU (International Components for Unicode) library, written in C/C++ and trimmed down for Node.js, providing Unicode-related functionality.
  • llhttp: a lightweight HTTP parser, written in C.
  • nghttp2/nghttp3/ngtcp2: handle the HTTP/2, HTTP/3 and QUIC protocols.
  • node-inspect: lets Node.js programs support a CLI debugging mode.
  • npm: the Node.js module manager, written in JavaScript.
  • openssl: written in C, the encryption-related module used by both the tls and crypto modules.
  • uv: written in C; uses non-blocking I/O operations to give Node.js access to system resources.
  • uvwasi: written in C, implements the WASI system call API.
  • v8: the JavaScript engine, written in C++.
  • zlib: for fast compression; Node.js uses zlib to provide synchronous, asynchronous and streaming compression and decompression interfaces.

The most important of these are the modules in the v8 and uv directories. V8 itself has no ability to run code asynchronously; in the browser, asynchrony is achieved with the help of other browser threads. This is why we often say JavaScript is single-threaded: its parsing engine only supports executing code synchronously. In Node.js, asynchrony is implemented mainly by libuv, so let's focus on how libuv works.

What is libuv

libuv is an asynchronous I/O library written in C that supports multiple platforms. It mainly solves the problem of I/O operations easily causing blocking. It was originally developed specifically for Node.js, but has since been adopted by other projects such as Luvit, Julia and pyuv. The following figure is the structure diagram of libuv.

[Figure: libuv architecture diagram]

libuv provides two ways of implementing asynchrony, corresponding to the two parts highlighted by the yellow boxes on the left and right of the figure above.

The left part is the network I/O module, which has different implementations on different platforms: epoll on Linux, kqueue on macOS and other BSD systems, Event ports on SunOS, and IOCP on Windows. Since it involves low-level operating system APIs, it is relatively complicated to understand and is not covered further here.

The right part covers the file I/O module, the DNS module and user code, which implement asynchronous operations through a thread pool. File I/O differs from network I/O: for it, libuv does not rely on low-level system APIs, but instead performs the blocking file I/O operations on a global thread pool.
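This thread pool behavior can be observed directly from JavaScript. Below is a minimal sketch of my own (not from the article): crypto.pbkdf2, like fs.readFile, is dispatched to libuv's thread pool, whose default size is 4 and which can be resized with the UV_THREADPOOL_SIZE environment variable before the process starts.

const crypto = require('crypto')

const start = Date.now()
for (let i = 0; i < 8; i++) {
  // each pbkdf2 call is queued as work for libuv's thread pool
  crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', () => {
    console.log(`job ${i} done after ${Date.now() - start}ms`)
  })
}
// with the default pool size of 4 the jobs usually finish in two batches;
// run with e.g. UV_THREADPOOL_SIZE=8 node pool.js to see them finish together

The file name pool.js above is just an example; any script name works.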

Event polling in libuv

The following figure is the event polling workflow diagram given by the libuv official website. Let's analyze it together with the code.

[Figure: event loop workflow diagram from the libuv documentation]

The core code of the libuv event loop is implemented in the uv_run() function. Below is part of the core code under Unix systems. Although it is written in C, it is a high-level language just like JavaScript, so it is not too hard to follow. The biggest differences are probably the asterisks and arrows. The asterisks can simply be ignored: the function parameter uv_loop_t* loop, for example, can be read as a variable loop of type uv_loop_t. The arrow "->" can be read like the dot ".", so loop->stop_flag can be read as loop.stop_flag.

int uv_run(uv_loop_t* loop, uv_run_mode mode) {
  ...
  r = uv__loop_alive(loop);
  if (!r) uv__update_time(loop);

  while (r != 0 && loop->stop_flag == 0) {
    uv__update_time(loop);
    uv__run_timers(loop);
    ran_pending = uv__run_pending(loop);
    uv__run_idle(loop);
    uv__run_prepare(loop);
    ...
    uv__io_poll(loop, timeout);
    uv__run_check(loop);
    uv__run_closing_handles(loop);
    ...
  }
  ...
}

uv__loop_alive

This function is used to determine whether event polling should continue. If there is no active task in the loop object, it will return 0 and exit the loop.

In C, such a "task" has a technical name, a handle, which can be understood as a variable pointing to the task. Broadly, these fall into two categories, requests and handles, which represent short-lived and long-lived tasks respectively. The specific code is as follows:

static int uv__loop_alive(const uv_loop_t* loop) {
  return uv__has_active_handles(loop) ||
         uv__has_active_reqs(loop) ||
         loop->closing_handles != NULL;
}
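From JavaScript, this liveness check is what decides whether a Node.js process exits on its own. A small illustrative sketch of my own:

// an active timer handle keeps uv__loop_alive() returning non-zero,
// so the process stays alive and keeps ticking
const timer = setInterval(() => console.log('tick'), 1000)

// unref() marks the handle as not counting toward loop liveness;
// once it is the only thing left, the loop can exit
setTimeout(() => timer.unref(), 3500)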

uv__update_time

To reduce the number of time-related system calls, this function caches the current system time. The precision is very high, down to the nanosecond level, but the cached value is kept in milliseconds.

The specific source code is as follows:

UV_UNUSED(static void uv__update_time(uv_loop_t* loop)) {
  loop->time = uv__hrtime(UV_CLOCK_FAST) / 1000000;
}

uv__run_timers

This executes the callbacks of setTimeout() and setInterval() timers that have reached their time threshold. The execution is a loop traversal: as the code below shows, timer callbacks are stored in a min-heap, and the loop exits when the heap is empty or when the earliest timer has not yet reached its threshold.

Before a timer callback is executed, the timer is removed; if repeat is set, it is re-inserted into the min-heap, and only then is the timer callback executed.

The specific code is as follows:

void uv__run_timers(uv_loop_t* loop) {
  struct heap_node* heap_node;
  uv_timer_t* handle;

  for (;;) {
    heap_node = heap_min(timer_heap(loop));
    if (heap_node == NULL) break;

    handle = container_of(heap_node, uv_timer_t, heap_node);
    if (handle->timeout > loop->time) break;

    uv_timer_stop(handle);
    uv_timer_again(handle);
    handle->timer_cb(handle);
  }
}
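The min-heap ordering and the uv_timer_again() re-arming branch map directly onto setTimeout and setInterval. A small sketch of my own to illustrate:

// both timers expire in the same loop iteration; uv__run_timers pops them
// from the min-heap in order, so their callbacks run back to back
setTimeout(() => console.log('timer A'), 10)
setTimeout(() => console.log('timer B'), 10)

// a repeating timer corresponds to the uv_timer_again() branch: the handle
// is pushed back into the heap after every expiration
let n = 0
const repeating = setInterval(() => {
  console.log('interval fired', ++n)
  if (n === 3) clearInterval(repeating)
}, 20)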

uv__run_pending

This traverses the I/O callbacks stored in pending_queue. It returns 0 when pending_queue is empty; otherwise it returns 1 after executing the callbacks in pending_queue.

The code is as follows:

static int uv__run_pending(uv_loop_t* loop) {
  QUEUE* q;
  QUEUE pq;
  uv__io_t* w;

  if (QUEUE_EMPTY(&loop->pending_queue)) return 0;

  QUEUE_MOVE(&loop->pending_queue, &pq);

  while (!QUEUE_EMPTY(&pq)) {
    q = QUEUE_HEAD(&pq);
    QUEUE_REMOVE(q);
    QUEUE_INIT(q);
    w = QUEUE_DATA(q, uv__io_t, pending_queue);
    w->cb(loop, w, POLLOUT);
  }

  return 1;
}

uv__run_idle / uv__run_prepare / uv__run_check

These three functions are all defined through the macro UV_LOOP_WATCHER_DEFINE. A macro can be understood as a code template, a "function that defines functions". The macro is invoked three times, passing prepare, check and idle as the name parameter, which defines the three functions uv__run_prepare, uv__run_check and uv__run_idle.

So their execution logic is the same: each loops over the corresponding queue loop->name##_handles in first-in-first-out order and executes the callback of every handle it takes out.

#define UV_LOOP_WATCHER_DEFINE(name, type)            \
  void uv__run_##name(uv_loop_t* loop) {              \
    uv_##name##_t* h;                                 \
    QUEUE queue;                                      \
    QUEUE* q;                                         \
    QUEUE_MOVE(&loop->name##_handles, &queue);        \
    while (!QUEUE_EMPTY(&queue)) {                    \
      q = QUEUE_HEAD(&queue);                         \
      h = QUEUE_DATA(q, uv_##name##_t, queue);        \
      QUEUE_REMOVE(q);                                \
      QUEUE_INSERT_TAIL(&loop->name##_handles, q);    \
      h->name##_cb(h);                                \
    }                                                 \
  }

UV_LOOP_WATCHER_DEFINE(prepare, PREPARE)
UV_LOOP_WATCHER_DEFINE(check, CHECK)
UV_LOOP_WATCHER_DEFINE(idle, IDLE)
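Node.js runs setImmediate() callbacks in the check phase defined here, which is why, inside an I/O callback, an immediate always fires before a zero-delay timer. A minimal sketch of my own (it assumes a readable package.json in the working directory):

const fs = require('fs')

fs.readFile('package.json', () => {
  // we are inside the poll (I/O) phase here; the check phase comes next,
  // so the immediate runs before the timer, which must wait for the
  // timers phase of the following iteration
  setTimeout(() => console.log('timeout'), 0)
  setImmediate(() => console.log('immediate'))
})
// prints "immediate" and then "timeout"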

uv__io_poll

uv__io_poll is mainly used to poll I/O operations. The specific implementation varies with the operating system; here we take Linux as the example.

The uv__io_poll function has a lot of source code; its core is two loops. Part of the code is as follows:

void uv__io_poll(uv_loop_t* loop, int timeout) {
  while (!QUEUE_EMPTY(&loop->watcher_queue)) {
    q = QUEUE_HEAD(&loop->watcher_queue);
    QUEUE_REMOVE(q);
    QUEUE_INIT(q);

    w = QUEUE_DATA(q, uv__io_t, watcher_queue);
    e.events = w->pevents;
    e.data.fd = w->fd;

    if (w->events == 0)
      op = EPOLL_CTL_ADD;
    else
      op = EPOLL_CTL_MOD;

    if (epoll_ctl(loop->backend_fd, op, w->fd, &e)) {
      if (errno != EEXIST) abort();
      if (epoll_ctl(loop->backend_fd, EPOLL_CTL_MOD, w->fd, &e)) abort();
    }

    w->events = w->pevents;
  }

  for (;;) {
    for (i = 0; i < nfds; i++) {
      pe = events + i;
      fd = pe->data.fd;
      w = loop->watchers[fd];

      pe->events &= w->pevents | POLLERR | POLLHUP;
      if (pe->events == POLLERR || pe->events == POLLHUP)
        pe->events |= w->pevents & (POLLIN | POLLOUT | UV__POLLRDHUP | UV__POLLPRI);

      if (pe->events != 0) {
        if (w == &loop->signal_io_watcher)
          have_signals = 1;
        else
          w->cb(loop, w, pe->events);
        nevents++;
      }
    }

    if (have_signals != 0)
      loop->signal_io_watcher.cb(loop, &loop->signal_io_watcher, POLLIN);
  }
  ...
}

In the while loop, the watcher queue watcher_queue is traversed; each watcher's events and file descriptor are copied into the event object e, and epoll_ctl is then called to register or modify the epoll event.

In the for loop, epoll is first waited on and the number of ready file descriptors is assigned to nfds; the ready events are then traversed and their callback functions executed.
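On the JavaScript side, this is the phase that drives sockets: a TCP server's connection and data callbacks are dispatched from uv__io_poll on the main thread, without touching the thread pool. A minimal sketch of my own (port 8124 is an arbitrary choice):

const net = require('net')

// readiness on these sockets is reported by epoll (kqueue/IOCP on other
// platforms) and surfaced as callback calls in the poll phase;
// no worker threads are involved
const server = net.createServer((socket) => {
  socket.on('data', (chunk) => socket.write(chunk)) // simple echo
})

server.listen(8124, () => console.log('echo server listening on 8124'))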

uv__run_closing_handles

This traverses the queue of handles waiting to be closed, closes handles such as streams, TCP and UDP, and then calls each handle's corresponding close_cb. The code is as follows:

static void uv__run_closing_handles(uv_loop_t* loop) {
  uv_handle_t* p;
  uv_handle_t* q;

  p = loop->closing_handles;
  loop->closing_handles = NULL;

  while (p) {
    q = p->next_closing;
    uv__finish_close(p);
    p = q;
  }
}
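Seen from JavaScript, the callbacks you pass to close() on servers and sockets are the ones that eventually run in this closing phase. A small sketch of my own:

const net = require('net')

const server = net.createServer()
server.listen(0, () => {
  // closing the server calls uv_close() under the hood; the callback below
  // fires later, once the handle has fully shut down
  server.close(() => console.log('server handle closed'))
})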

process.nextTick and Promise

Although process.nextTick and Promise are both asynchronous APIs, they are not part of the event polling: each has its own task queue, which is executed after each step of the event loop completes. So we need to be careful when using these two APIs: if the callbacks passed in run long tasks or recurse, the event polling will be blocked and I/O operations will be "starved".

The following code is an example where recursively calling process.nextTick prevents the fs.readFile callback from ever executing.

fs.readFile('config.json', (err, data) => { ... })

const traverse = () => {
  process.nextTick(traverse)
}
traverse()
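For contrast, here is the same recursion rewritten with setImmediate (a minimal sketch of the fix discussed next); with it, the readFile callback does get a chance to run:

fs.readFile('config.json', (err, data) => {
  console.log('readFile finished') // now reachable
})

const traverse = () => {
  setImmediate(traverse)
}
traverse()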

setImmediate avoids the starvation because its callbacks are executed within the event polling itself (in the check phase), so I/O still gets a turn on every iteration. Note also that the process.nextTick task queue has a higher priority than the Promise task queue; the reason can be seen in the following code:

function processTicksAndRejections() {
    let tock;
    do {
        while (tock = queue.shift()) {
            const asyncId = tock[async_id_symbol];
            emitBefore(asyncId, tock[trigger_async_id_symbol], tock);
            try {
                const callback = tock.callback;
                if (tock.args === undefined) {
                    callback();
                } else {
                    const args = tock.args;
                    switch (args.length) {
                    case 1:
                        callback(args[0]);
                        break;
                    case 2:
                        callback(args[0], args[1]);
                        break;
                    case 3:
                        callback(args[0], args[1], args[2]);
                        break;
                    case 4:
                        callback(args[0], args[1], args[2], args[3]);
                        break;
                    default:
                        callback(...args);
                    }
                }
            } finally {
                if (destroyHooksExist()) emitDestroy(asyncId);
            }
            emitAfter(asyncId);
        }
        runMicrotasks();
    } while (!queue.isEmpty() || processPromiseRejections());
    setHasTickScheduled(false);
    setHasRejectionToWarn(false);
}

As can be seen from the processTicksAndRejections() function, the while loop first drains the callbacks in queue, which are the ones added via process.nextTick; only when that loop finishes is runMicrotasks() called to execute the Promise callbacks.
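The ordering this implies is easy to observe; a minimal sketch:

Promise.resolve().then(() => console.log('promise microtask'))
process.nextTick(() => console.log('nextTick'))

// Output:
//   nextTick
//   promise microtask
// The nextTick queue is drained first; runMicrotasks() then runs the Promise jobs.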

Summary

The core of Node.js that relies on libuv can be divided into two parts: one is network I/O, whose underlying implementation uses different system APIs on different operating systems; the other is file I/O, DNS and user code, which are handled by the thread pool.

libuv's core mechanism for handling asynchronous operations is event polling, which is divided into several steps; broadly, each step traverses a queue and executes the callbacks in it.

Finally, the asynchronous APIs process.nextTick and Promise are not part of event polling, and improper use will block it; one solution is to use setImmediate instead.



Statement:
This article is reproduced from juejin.cn. If there is any infringement, please contact admin@php.cn to have it deleted.