
Let's talk about how to use the multi-threading capabilities of Node.js to do asynchronous calculations

青灯夜游
2021-09-03 18:09:16

How do you perform asynchronous computation? This article shows how to use the multi-threading capabilities of the browser and Node.js to do it. I hope it will be helpful to you!


Node.js is often described as a good fit for high-performance servers, but what does high performance mean?

All code ultimately runs on the CPU. How efficiently the CPU is used is a measure of performance, which means the CPU should not sit idle.

When does it sit idle?

  • While the program is performing network or disk IO, the CPU has nothing to do and sits idle.
  • A multi-core CPU can run several programs at the same time. If only one core is used, the other cores sit idle.

So, if you want to achieve high performance, you must solve these two problems.

The operating system provides the thread abstraction: different execution branches of the code can run on different CPU cores at the same time. This is how a program takes advantage of a multi-core CPU.

If a thread performing IO simply blocks until the read or write completes, that is relatively inefficient. Hardware therefore provides DMA (direct memory access): the device controller, rather than the CPU, moves the data between the device and memory and notifies the CPU when the transfer is done. While a thread's IO is in flight, the operating system can pause that thread and resume it once it is notified that the DMA transfer has completed.

Multi-threading and DMA are the mechanisms that let the operating system take advantage of multi-core CPUs and keep the CPU from being blocked by IO.

Various programming languages wrap these mechanisms, and Node.js does the same. Node.js is high-performance because of its asynchronous IO design.

Node.js implements asynchronous IO in libuv, on top of the asynchronous system calls provided by the operating system. These are generally asynchronous at the hardware level, for example using DMA to transfer data. Some system calls are synchronous, however, and libuv makes them asynchronous by running them on its own thread pool. The size of this thread pool can be set through the UV_THREADPOOL_SIZE environment variable; the default is 4.


Many of the asynchronous APIs we call in our code are implemented through threads.

For example:

const fsPromises = require('fs').promises;

(async function () {
    const data = await fsPromises.readFile('./filename');
})();
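
As a minimal sketch of the same idea with several reads in flight at once (readAll is a hypothetical helper, not part of the original article): each readFile call is handed off to libuv, so the main thread stays free while the reads are in progress.

```javascript
const fs = require('fs');

// Hypothetical helper: start all reads at once and wait for all of them.
// The main thread only schedules the reads and collects the results.
function readAll(paths) {
    return Promise.all(paths.map((p) => fs.promises.readFile(p, 'utf8')));
}
```

If many such operations are pending at the same time, the pool size can be raised when starting the process, for example with UV_THREADPOOL_SIZE=8 node app.js in a Unix shell (app.js is a placeholder name).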

However, these asynchronous APIs only solve the IO problem. How do we take advantage of a multi-core CPU for computation?

Node.js introduced the worker_threads module experimentally in 10.5 (officially in 12). It can create threads that end up running on multiple CPU cores, which is the way to use a multi-core CPU for computation.

The asynchronous APIs use threads to do IO, while worker_threads creates threads to do computation. The two serve different purposes.

To explain worker_threads clearly, we have to start with the browser's web worker.

Browser's web worker

Browsers also face the problem of not being able to use multi-core CPUs for computation, so HTML5 introduced web workers, which run computations on another thread.

<!DOCTYPE html>
<html>
<head></head>
<body>
    <script>
        (async function () {
            const res = await runCalcWorker(2, 3, 3, 3);
            console.log(res);
        })();

        function runCalcWorker(...nums) {
            return new Promise((resolve, reject) => {
                const calcWorker = new Worker('./webWorker.js');
                calcWorker.postMessage(nums)
                calcWorker.onmessage = function (msg) {
                    resolve(msg.data);
                };
                calcWorker.onerror = reject;
            });
        }
    </script>

</body>
</html>

We create a Worker object, specifying the JS code to run on another thread, then pass a message to it through postMessage and receive the reply through onmessage. This process is also asynchronous, so we wrap it in a promise.

Then receive data in webWorker.js, do calculations, and then return the results through postMessage.

// webWorker.js
onmessage = function(msg) {
    if (Array.isArray(msg.data)) {
        const res = msg.data.reduce((total, cur) => {
            return total += cur;
        }, 0);
        postMessage(res);
    }
}

In this way, the calculation can run on another CPU core. The calling code looks no different from ordinary asynchronous code, but the asynchrony here comes from computation, not from IO.

The worker thread of Node.js is similar to the web worker. I even suspect that the name of the worker thread is influenced by the web worker.

Node.js worker thread

If the above asynchronous calculation logic is implemented in Node.js, it will look like this:

const runCalcWorker = require('./runCalcWorker');

(async function () {
    const res = await runCalcWorker(2, 3, 3, 3);
    console.log(res);
})();

The call is asynchronous, because asynchronous computation is used in exactly the same way as asynchronous IO.

// runCalcWorker.js
const { Worker } = require('worker_threads');

module.exports = function(...nums) {
    return new Promise(function(resolve, reject) {
        // Note: a relative path is resolved against the current working directory
        const calcWorker = new Worker('./nodeWorker.js');
        calcWorker.postMessage(nums);

        calcWorker.on('message', (res) => {
            resolve(res);
            // The worker keeps listening for messages, so stop it once the result arrives
            calcWorker.terminate();
        });
        calcWorker.on('error', reject);
    });
}

Asynchronous computation is implemented by creating a Worker object that specifies the JS to run on another thread, passing it a message through postMessage, and receiving the result through the message event. This is very similar to web workers.

// nodeWorker.js
const {
    parentPort
} = require('worker_threads');

parentPort.on('message', (data) => {
    const res = data.reduce((total, cur) => {
        return total += cur;
    }, 0);
    parentPort.postMessage(res);
});

nodeWorker.js, which performs the calculation, listens for the message event, does the calculation, and returns the result through parentPort.postMessage.

Compare this with the web worker and you will find the two remarkably similar. That is why I think the API of Node.js worker threads was designed with reference to web workers.
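
The wrapper above only listens for message and error. As a hedged sketch of a slightly more defensive variant (runInWorker and sumSource are hypothetical names, and the worker source is passed inline with the eval option only so the example fits in one file), the promise can also be rejected when the worker exits without replying:

```javascript
const { Worker } = require('worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in one file.
const sumSource = `
const { parentPort } = require('worker_threads');
parentPort.on('message', (nums) => {
    parentPort.postMessage(nums.reduce((total, cur) => total + cur, 0));
});
`;

// Hypothetical helper: send `input` to a worker running `source`
// and settle the promise exactly once.
function runInWorker(source, input) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(source, { eval: true });
        worker.once('message', (result) => {
            resolve(result);
            // The worker would otherwise keep listening and keep the process alive.
            worker.terminate();
        });
        worker.once('error', reject);
        worker.once('exit', (code) => {
            // terminate() also triggers 'exit', but by then the promise is
            // already settled, so this reject only matters for early exits.
            reject(new Error(`Worker exited with code ${code} before replying`));
        });
        worker.postMessage(input);
    });
}
```

Calling runInWorker(sumSource, [2, 3, 3, 3]) resolves with the sum, while a worker that exits early leads to a rejection instead of a promise that never settles.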

In addition, a worker thread can also receive data through workerData when it is created:

const { Worker } = require('worker_threads');

module.exports = function(...nums) {
    return new Promise(function(resolve, reject) {
        const calcWorker = new Worker('./nodeWorker.js', {
            workerData: nums
        });
        calcWorker.on('message', resolve);
        calcWorker.on('error', reject);
    });
}

Then the worker thread can retrieve it through workerData:

const {
    parentPort,
    workerData
} = require('worker_threads');

const data = workerData;
const res = data.reduce((total, cur) => {
    return total += cur;
}, 0);
parentPort.postMessage(res);

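Because data is passed as messages, it has to be serialized and deserialized, so data that cannot be serialized, such as functions, cannot be transferred. This is also a characteristic of worker threads.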
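
A small sketch can make this limit visible. It assumes a hypothetical helper, canBeSent, that tries to post a value through a MessageChannel from worker_threads, which uses the same message-passing mechanism as a worker's postMessage:

```javascript
const { MessageChannel } = require('worker_threads');

// Hypothetical helper: report whether `value` survives serialization
// for message passing.
function canBeSent(value) {
    const { port1 } = new MessageChannel();
    try {
        port1.postMessage(value); // throws if the value cannot be cloned
        return true;
    } catch (err) {
        return false;
    } finally {
        port1.close(); // closes both ends of the channel
    }
}

console.log(canBeSent([2, 3, 3, 3])); // plain data
console.log(canBeSent(() => {}));     // a function
```

Plain arrays and objects pass, while a function, or an object that contains one, is rejected at the moment postMessage is called.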

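Comparison of Node.js worker threads and browser web workers

In terms of usage, both can be wrapped as ordinary asynchronous calls and feel no different from other asynchronous APIs.

Both require data to be serialized and deserialized, and both send and receive messages through postMessage and message events.

Besides messages, Node.js worker threads support more ways of passing data, such as workerData.

In essence, though, both exist to implement asynchronous computation and make full use of multi-core CPUs, so there is no real difference.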
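
To illustrate using several cores at once, here is a minimal sketch that splits one calculation across multiple workers. The names sumRange and parallelSum are hypothetical, and the worker source is passed inline with the eval option only to keep the example in a single file:

```javascript
const os = require('os');
const { Worker } = require('worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in one file.
const workerSource = `
const { parentPort, workerData } = require('worker_threads');
let total = 0;
for (let i = workerData.start; i < workerData.end; i++) {
    total += i;
}
parentPort.postMessage(total);
`;

// Hypothetical helper: sum the integers in [start, end) on one worker thread.
function sumRange(start, end) {
    return new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, {
            eval: true,
            workerData: { start, end },
        });
        worker.on('message', resolve);
        worker.on('error', reject);
    });
}

// Split [0, n) into one chunk per worker and add up the partial sums.
async function parallelSum(n, workerCount = os.cpus().length) {
    const chunk = Math.ceil(n / workerCount);
    const tasks = [];
    for (let start = 0; start < n; start += chunk) {
        tasks.push(sumRange(start, Math.min(start + chunk, n)));
    }
    const partials = await Promise.all(tasks);
    return partials.reduce((total, cur) => total + cur, 0);
}
```

Starting a worker has a cost of its own, so a split like this is only likely to pay off when each chunk does a substantial amount of work.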

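Summary

A high-performance program makes full use of CPU resources and does not let the CPU sit idle: the CPU should not wait during IO, and all the cores of a multi-core CPU should be usable for computation at the same time. The operating system provides threads and DMA to solve these problems. Node.js wraps them accordingly, in the asynchronous IO APIs implemented by libuv. Asynchronous computation, however, was only officially introduced in Node 12, in the form of worker threads. The API design follows the browser's web worker: messages are passed through postMessage and message events, and the data must be serialized, so functions cannot be passed.

In terms of usage, asynchronous computation and asynchronous IO look the same. But asynchronous IO only keeps the CPU from blocking while it waits for IO to complete, whereas asynchronous computation uses multiple CPU cores to compute in parallel, which can improve computing performance several times over.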


Statement:
This article is reproduced from juejin.cn. If there is any infringement, please contact admin@php.cn to have it removed.