search
HomeBackend DevelopmentPHP TutorialUsage of curl_multi function set in php

1. Introduction

I’ve been quite busy during this period, and I haven’t written a blog for a long time. Today I will talk about my experience in using the curl_multi_* function set and the issue of http requests.

When our user php initiates an http request. What would we first think of using? That's right, we will create curl to request. What if we need to initiate multiple http requests in one execution. This is simple, make a url request for each URL. Request to play the first one and then request the second one... Is this the end? What else can we say?

Official website link: http://php.net/manual/zh/book.curl.php

2. Disadvantages of multiple simple curl requests

         php中curl_multi函数集的用法

Let’s give a chestnut. There are now three http requests. Each request takes 2s. If you follow a simple curl request (Figure 1-(1)). It took 6 seconds. This is intolerable. The more requests there are, the more time it takes.

Is there a way to reduce query time? Can three http requests be executed at the same time (Figure 1-(1))? There are many ways to solve this problem and reduce the time consumption to 2 seconds. Such as: multi-process, thread, event loop, curl_multi_*, etc. The simplest way is to use curl_multi_* functions. In fact, the internal implementation of curl_multi_* uses an event loop.

3. Simple curl_multi_* application

<code><span><?php </span><span>/**
 *
 * curl_multi_*简单运用
 *
 *<span> @author</span>: rudy
 *<span> @date</span>: 2016/07/12
 */</span><span>/**
 * 根据url,postData获取curl请求对象,这个比较简单,可以看官方文档
 */</span><span><span>function</span><span>getCurlObject</span><span>(<span>$url</span>,<span>$postData</span>=array<span>()</span>,<span>$header</span>=array<span>()</span>)</span>{</span><span>$options</span> = <span>array</span>();
    <span>$url</span> = trim(<span>$url</span>);
    <span>$options</span>[CURLOPT_URL] = <span>$url</span>;
    <span>$options</span>[CURLOPT_TIMEOUT] = <span>10</span>;
    <span>$options</span>[CURLOPT_USERAGENT] = <span>'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.89 Safari/537.36'</span>;
    <span>$options</span>[CURLOPT_RETURNTRANSFER] = <span>true</span>;
    <span>//    $options[CURLOPT_PROXY] = '127.0.0.1:8888';</span><span>foreach</span>(<span>$header</span><span>as</span><span>$key</span>=><span>$value</span>){
        <span>$options</span>[<span>$key</span>] =<span>$value</span>;
    }
    <span>if</span>(!<span>empty</span>(<span>$postData</span>) && is_array(<span>$postData</span>)){
        <span>$options</span>[CURLOPT_POST] = <span>true</span>;
        <span>$options</span>[CURLOPT_POSTFIELDS] = http_build_query(<span>$postData</span>);
    }
    <span>if</span>(stripos(<span>$url</span>,<span>'https'</span>) === <span>0</span>){
        <span>$options</span>[CURLOPT_SSL_VERIFYPEER] = <span>false</span>;
    }
    <span>$ch</span> = curl_init();
    curl_setopt_array(<span>$ch</span>,<span>$options</span>);

    <span>return</span><span>$ch</span>;
}

<span>// 创建三个待请求的url对象</span><span>$chList</span> = <span>array</span>();
<span>$chList</span>[] = getCurlObject(<span>'https://www.baidu.com'</span>);
<span>$chList</span>[] = getCurlObject(<span>'http://www.jd.com'</span>);
<span>$chList</span>[] = getCurlObject(<span>'http://www.jianshu.com/'</span>);

<span>// 创建多请求执行对象</span><span>$downloader</span> = curl_multi_init();

<span>// 将三个待请求对象放入下载器中</span><span>foreach</span> (<span>$chList</span><span>as</span><span>$ch</span>){
    curl_multi_add_handle(<span>$downloader</span>,<span>$ch</span>);
}

<span>// 轮询</span><span>do</span> {
    <span>while</span> ((<span>$execrun</span> = curl_multi_exec(<span>$downloader</span>, <span>$running</span>)) == CURLM_CALL_MULTI_PERFORM) ;
    <span>if</span> (<span>$execrun</span> != CURLM_OK) {
        <span>break</span>;
    }

    <span>// 一旦有一个请求完成,找出来,处理,因为curl底层是select,所以最大受限于1024</span><span>while</span> (<span>$done</span> = curl_multi_info_read(<span>$downloader</span>))
    {
        <span>// 从请求中获取信息、内容、错误</span><span>$info</span> = curl_getinfo(<span>$done</span>[<span>'handle'</span>]);
        <span>$output</span> = curl_multi_getcontent(<span>$done</span>[<span>'handle'</span>]);
        <span>$error</span> = curl_error(<span>$done</span>[<span>'handle'</span>]);

        <span>// 将请求结果保存,我这里是打印出来</span><span>print</span><span>$output</span>;
<span>//        print "一个请求下载完成!\n";</span><span>// 把请求已经完成了得 curl handle 删除</span>
        curl_multi_remove_handle(<span>$downloader</span>, <span>$done</span>[<span>'handle'</span>]);
    }

    <span>// 当没有数据的时候进行堵塞,把 CPU 使用权交出来,避免上面 do 死循环空跑数据导致 CPU 100%</span><span>if</span> (<span>$running</span>) {
        <span>$rel</span> = curl_multi_select(<span>$downloader</span>, <span>1</span>);
        <span>if</span>(<span>$rel</span> == -<span>1</span>){
            usleep(<span>1000</span>);
        }
    }

    <span>if</span>( <span>$running</span> == <span>false</span>){
        <span>break</span>;
    }
} <span>while</span> (<span>true</span>);

<span>// 下载完毕,关闭下载器</span>
curl_multi_close(<span>$downloader</span>);
<span>echo</span><span>"所有请求下载完成!"</span>;</span></code>

In this example, first create three or more url request objects to be requested. Create a downloader through curl_multi_* functions. Write the request to the downloader. Last poll. Waiting for three requests to complete now. Do processing.

4. Complex curl_multi_* usage

Is this how curl_multi_* is used? too yong too simple! In the above example. The requests in the downloader $downloader are added from the beginning. Can we dynamically add requests to the downloader. Dynamically retrieve completed requests from the downloader. Think about it. What's this? Isn't this the core part of the crawler - the dynamic downloader. How to add it dynamically? We can use multi-process to add via IPC. We can add waits through queues through coroutines.

I implemented a crawler framework through coroutine + curl_multi_*.
Tspider: https://github.com/hirudy/Tspider.
A single process can handle requests 2000-5000/min.

').addClass('pre-numbering').hide(); $(this).addClass('has-numbering').parent().append($numbering); for (i = 1; i ').text(i)); }; $numbering.fadeIn(1700); }); });

The above introduces the usage of the curl_multi function set in PHP, including the relevant content. I hope it will be helpful to friends who are interested in PHP tutorials.

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
What is the difference between unset() and session_destroy()?What is the difference between unset() and session_destroy()?May 04, 2025 am 12:19 AM

Thedifferencebetweenunset()andsession_destroy()isthatunset()clearsspecificsessionvariableswhilekeepingthesessionactive,whereassession_destroy()terminatestheentiresession.1)Useunset()toremovespecificsessionvariableswithoutaffectingthesession'soveralls

What is sticky sessions (session affinity) in the context of load balancing?What is sticky sessions (session affinity) in the context of load balancing?May 04, 2025 am 12:16 AM

Stickysessionsensureuserrequestsareroutedtothesameserverforsessiondataconsistency.1)SessionIdentificationassignsuserstoserversusingcookiesorURLmodifications.2)ConsistentRoutingdirectssubsequentrequeststothesameserver.3)LoadBalancingdistributesnewuser

What are the different session save handlers available in PHP?What are the different session save handlers available in PHP?May 04, 2025 am 12:14 AM

PHPoffersvarioussessionsavehandlers:1)Files:Default,simplebutmaybottleneckonhigh-trafficsites.2)Memcached:High-performance,idealforspeed-criticalapplications.3)Redis:SimilartoMemcached,withaddedpersistence.4)Databases:Offerscontrol,usefulforintegrati

What is a session in PHP, and why are they used?What is a session in PHP, and why are they used?May 04, 2025 am 12:12 AM

Session in PHP is a mechanism for saving user data on the server side to maintain state between multiple requests. Specifically, 1) the session is started by the session_start() function, and data is stored and read through the $_SESSION super global array; 2) the session data is stored in the server's temporary files by default, but can be optimized through database or memory storage; 3) the session can be used to realize user login status tracking and shopping cart management functions; 4) Pay attention to the secure transmission and performance optimization of the session to ensure the security and efficiency of the application.

Explain the lifecycle of a PHP session.Explain the lifecycle of a PHP session.May 04, 2025 am 12:04 AM

PHPsessionsstartwithsession_start(),whichgeneratesauniqueIDandcreatesaserverfile;theypersistacrossrequestsandcanbemanuallyendedwithsession_destroy().1)Sessionsbeginwhensession_start()iscalled,creatingauniqueIDandserverfile.2)Theycontinueasdataisloade

What is the difference between absolute and idle session timeouts?What is the difference between absolute and idle session timeouts?May 03, 2025 am 12:21 AM

Absolute session timeout starts at the time of session creation, while an idle session timeout starts at the time of user's no operation. Absolute session timeout is suitable for scenarios where strict control of the session life cycle is required, such as financial applications; idle session timeout is suitable for applications that want users to keep their session active for a long time, such as social media.

What steps would you take if sessions aren't working on your server?What steps would you take if sessions aren't working on your server?May 03, 2025 am 12:19 AM

The server session failure can be solved through the following steps: 1. Check the server configuration to ensure that the session is set correctly. 2. Verify client cookies, confirm that the browser supports it and send it correctly. 3. Check session storage services, such as Redis, to ensure that they are running normally. 4. Review the application code to ensure the correct session logic. Through these steps, conversation problems can be effectively diagnosed and repaired and user experience can be improved.

What is the significance of the session_start() function?What is the significance of the session_start() function?May 03, 2025 am 12:18 AM

session_start()iscrucialinPHPformanagingusersessions.1)Itinitiatesanewsessionifnoneexists,2)resumesanexistingsession,and3)setsasessioncookieforcontinuityacrossrequests,enablingapplicationslikeuserauthenticationandpersonalizedcontent.

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

DVWA

DVWA

Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

Safe Exam Browser

Safe Exam Browser

Safe Exam Browser is a secure browser environment for taking online exams securely. This software turns any computer into a secure workstation. It controls access to any utility and prevents students from using unauthorized resources.

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SecLists

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.