node+async implements concurrency control


php中世界最好的语言

May 12, 2018 am 10:40 AM


This article shows how to control concurrency in Node with async, and what to watch out for when doing so. It works through a practical case.

Objective

Create a lesson5 project and write code in it.

The entry point of the code is app.js. When node app.js is called, it outputs the title, link, and first comment of every topic on the CNode (https://cnodejs.org/) community homepage, in JSON format.

Note: Unlike the previous lesson, the number of concurrent connections must be limited to 5.

Output example:

[
 {
  "title": "【公告】发招聘帖的同学留意一下这里",
  "href": "http://cnodejs.org/topic/541ed2d05e28155f24676a12",
  "comment1": "呵呵呵呵"
 },
 {
  "title": "发布一款 Sublime Text 下的 JavaScript 语法高亮插件",
  "href": "http://cnodejs.org/topic/54207e2efffeb6de3d61f68f",
  "comment1": "沙发！"
 }
]

Knowledge points

Learn how to use async (https://github.com/caolan/async). Here is a detailed async demo: https://github.com/alsotang/async_demo

Learn to use async to control the number of concurrent connections.

Course content

The code in lesson4 is actually imperfect, because it sent 40 concurrent requests at once. CNode tolerates this, but other websites may treat that many concurrent connections as a malicious request and block your IP.

When we write a crawler and there are 1,000 links to crawl, we cannot send 1,000 concurrent requests at the same time. We need to control the concurrency, for example 10 requests at a time, and work through the 1,000 links gradually.

Doing this with async is easy.

This time we introduce async's mapLimit(arr, limit, iterator, callback) interface. There is another commonly used interface for controlling the number of concurrent connections, queue(worker, concurrency); see https://github.com/caolan/async#queueworker-concurrency for its documentation. This time I won't take you through crawling a real website. Let's focus on the knowledge point: controlling the number of concurrent connections.
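To give a feel for what a queue with a concurrency setting does, here is a minimal hand-written sketch. `createQueue`, its `push`, `running`, and `length` methods, and the example.com URLs are hypothetical names made up for illustration. This is not the async library's implementation or API, only a simplified model of the idea: tasks wait in a list and a new one starts whenever a slot frees up.

```javascript
// Hypothetical createQueue: a simplified illustration, not async.queue itself.
function createQueue(worker, concurrency) {
  var pending = [];  // tasks waiting for a free slot
  var running = 0;   // tasks currently being processed

  function next() {
    // Start waiting tasks until all slots are busy or nothing is waiting.
    while (running < concurrency && pending.length > 0) {
      run(pending.shift());
    }
  }

  function run(item) {
    running++;
    worker(item.task, function (err, result) {
      running--;
      if (item.done) item.done(err, result);
      next(); // a slot is free, so start the next waiting task
    });
  }

  return {
    // Tasks can be pushed at any time, even while others are running.
    push: function (task, done) {
      pending.push({ task: task, done: done });
      next();
    },
    running: function () { return running; },
    length: function () { return pending.length; }
  };
}

// Usage: at most 2 of the 5 tasks run at once.
var q = createQueue(function (url, callback) {
  setTimeout(function () {
    callback(null, url + ' html content');
  }, 10);
}, 2);

['a', 'b', 'c', 'd', 'e'].forEach(function (name) {
  q.push('http://example.com/' + name, function (err, content) {
    console.log('finished:', content);
  });
});

// Right after pushing, 2 tasks are running and 3 are still waiting.
console.log('running:', q.running(), 'waiting:', q.length());
```

Unlike a map over a fixed array, a queue of this kind accepts new tasks while earlier ones are still in progress, which suits crawlers that discover links as they go.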

By the way, another question: when should you use eventproxy and when should you use async? Aren't they both used for asynchronous flow control? My answer is:

When you need to aggregate data from multiple sources (usually fewer than 10), eventproxy is convenient. When you need a queue, need to control the number of concurrent operations, or simply like a functional programming style, use async. Most scenarios are the former, so I personally use eventproxy most of the time. Now for the main topic.

First, we fake a fetchUrl(url, callback) function. When you call it like this:

fetchUrl('http://www.baidu.com', function (err, content) {
 // do something with `content`
});

it returns the page content of http://www.baidu.com through the callback.

Of course, the returned content here is fake, and the delay before it returns is random. Each time the function is called, it also reports how many calls are currently in flight.

// Counter for the number of concurrent connections
var concurrencyCount = 0;
var fetchUrl = function (url, callback) {
 // delay is a random integer below 2000 (milliseconds)
 var delay = parseInt((Math.random() * 10000000) % 2000, 10);
 concurrencyCount++;
 console.log('current concurrency is', concurrencyCount, ', now fetching', url, ', will take ' + delay + ' ms');
 setTimeout(function () {
  concurrencyCount--;
  callback(null, url + ' html content');
 }, delay);
};
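To see why a limit matters, the sketch below calls a fake fetch for every URL in a plain loop, with no concurrency control. `fakeFetch` is a shortened stand-in for the fetchUrl above (it assumes a fixed 10 ms delay and also records the peak), and the example.com URLs are made up for illustration.

```javascript
var concurrencyCount = 0;
var peakConcurrency = 0;

// Shortened stand-in for fetchUrl: fixed 10 ms delay, tracks the peak.
function fakeFetch(url, callback) {
  concurrencyCount++;
  peakConcurrency = Math.max(peakConcurrency, concurrencyCount);
  setTimeout(function () {
    concurrencyCount--;
    callback(null, url + ' html content');
  }, 10);
}

var urls = [
  'http://example.com/a',
  'http://example.com/b',
  'http://example.com/c',
  'http://example.com/d'
];

var done = 0;
urls.forEach(function (url) {
  fakeFetch(url, function (err, content) {
    done++;
    if (done === urls.length) {
      console.log('all done, peak concurrency was', peakConcurrency);
    }
  });
});

// All four calls have already started before any timer fires.
console.log('in flight right after the loop:', concurrencyCount);
```

With four links this is harmless, but the same loop over 1,000 links would open 1,000 requests at once, which is exactly what the limit is meant to prevent.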

Next, let's fake a set of links:

var urls = [];
// Assumption: 30 placeholder links; the count and URL pattern are illustrative.
for (var i = 0; i < 30; i++) {
 urls.push('http://datasource_' + i);
}

Then we use async.mapLimit to fetch them concurrently, at most 5 at a time, and collect the results:

var async = require('async');

async.mapLimit(urls, 5, function (url, callback) {
 fetchUrl(url, callback);
}, function (err, result) {
 console.log('final:');
 console.log(result);
});

When you run it, the number of concurrent connections starts growing from 1, and once it reaches 5 it stops increasing. Whenever one of the tasks completes, the next fetch starts. The number of concurrent connections never exceeds 5.
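The behavior described above can be modeled with a small hand-written function. The sketch below is a simplified illustration under stated assumptions, not the async library's actual code: `simpleMapLimit` and `fakeFetch` are hypothetical names, and the fake fetch uses a fixed 10 ms delay so that the peak is easy to observe.

```javascript
// Hypothetical simpleMapLimit: a simplified model, NOT async's real code.
function simpleMapLimit(arr, limit, iterator, callback) {
  var results = new Array(arr.length);
  var nextIndex = 0;   // index of the next item to start
  var running = 0;     // tasks currently in flight
  var finished = 0;    // tasks that have completed
  var failed = false;  // set once an error has been reported

  if (arr.length === 0) {
    return callback(null, results);
  }

  function startNext() {
    // Start tasks until the limit is reached or nothing is left to start.
    while (!failed && running < limit && nextIndex < arr.length) {
      launch(nextIndex++);
    }
  }

  function launch(index) {
    running++;
    iterator(arr[index], function (err, value) {
      running--;
      if (failed) return;
      if (err) {
        failed = true;
        return callback(err);
      }
      results[index] = value; // keep results in input order
      finished++;
      if (finished === arr.length) {
        return callback(null, results);
      }
      startNext(); // a slot just freed up
    });
  }

  startNext();
}

// Usage: 12 fake fetches, at most 5 in flight at any moment.
var active = 0;
var peak = 0;
function fakeFetch(url, cb) {
  active++;
  peak = Math.max(peak, active);
  setTimeout(function () {
    active--;
    cb(null, url + ' html content');
  }, 10);
}

var demoUrls = [];
for (var i = 0; i < 12; i++) {
  demoUrls.push('http://example.com/' + i);
}

simpleMapLimit(demoUrls, 5, fakeFetch, function (err, result) {
  console.log('fetched', result.length, 'pages, peak concurrency:', peak);
});
```

The key point is the call to `startNext()` inside the completion callback: a new task starts only when a running one finishes, which is why the count climbs to the limit and then stays there.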
