


PHP Simulates Sending POST Requests Part 5 Basic Use of Curl and Multi-Threading Optimization, Part 5 Curl
Today we will introduce the heavy weapon of PHP to simulate sending POST requests - the cURL function library Use and its multi-threaded optimization methods.
Speaking of the cURL function, it is a cliché, but a lot of information on the Internet is unclear in key parts. It lists a lot of things in the manual, which made me extremely painful when I started. I looked through some information and combined it with my own Notes, summarizing this blog post, I hope it can provide some help to developers who are new to cURL.
Basic steps for using cURL
First, let’s introduce cURL:
cURL simulates the browser to transmit data according to the HTTP header information. It supports FTP, FTPS, HTTP, HTTPS, DICT, FILE and other protocols. It has HTTPS authentication, HTTP POST method, HTTP PUT method, FTP upload, HTTP upload, proxy Server, cookies, username/password authentication and other functions. cURL can be said to be a powerful tool for crawling websites, grabbing web pages, POST data and other functions.
Using cURL functions is mainly divided into four parts:
1. Initialize cURL.
2. Set cURL variables, which is the core of cRUL, and the extended functionality depends entirely on this step.
3. Execute cURL and get the results.
4. Close the connection and recycle resources.
<span>$ch</span> = curl_init();<span>//</span><span>1</span> <span> curl_setopt(</span><span>$ch</span>, CURLOPT_URL, "http://localhost");<span>//</span><span>2</span> <span>$output</span> = curl_exec(<span>$ch</span>);<span>//</span><span>3</span> <span> curl_close(</span><span>$ch</span>);<span>//</span><span>4</span>
In addition, we can also use the curl_getinfo($ch) function to obtain curl execution information, and the result is an array
The contents of the $info array include the following:
- "url" //Resource network address
- “content_type” //Content encoding
- "http_code" //HTTP status code
- “filetime” //File creation time
- “total_time” //Total time spent
- “size_upload” //The size of the uploaded data
- “size_download” //The size of the downloaded data
- “speed_download” //Download speed
- “speed_upload” //Upload speed
- “download_content_length”//The length of the download content
- “upload_content_length” //The length of the uploaded content
Common settings for cURL
The following is a detailed introduction to the commonly used variable settings when using curl in the second step. When using the curl function, it can be set according to various needs.
Set basic information:
curl_setopt($ch, CURLOPT_URL, $string);//Set the directory address of curl
curl_setopt($ch, CURLOPT_PORT, $port);//Set the connection port, generally not set to the default 80
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);//Returning the result stream does not output it for subsequent processing. This item is usually set to process the captured information later instead of outputting it directly.
Set POST data information:
curl_setopt($ch, CURLOPT_POST, 1);//Set the data transmission method to POST
curl_setopt($ch, CURLOPT_POSTFIELDS, $string);//Set the data to be transmitted
Set verification information:
curl_setopt($ch, CURLOPT_COOKIE, $string);//Set the cookie information carried when curl is executed
curl_setopt($ch, CURLOPT_USERAGENT, $string);//Set browser information for curl simulation
curl_setopt($ch, CURLOPT_REFERER, $string);//Set the referer in the header to help crack anti-leeches
curl_setopt($ch, CURLOPT_USERPWD, $string);//Pass the username and password required for a connection in the format: "[username]:[password]"
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);//Set to allow server redirection
Set enhanced information:
curl_setopt($ch, CURLOPT_NOBODY, 1);//The setting does not allow the output of HTML body. If you set this option when crawling information such as page titles, it will greatly speed up the speed
curl_setopt($ch, CURLOPT_TIMEOUT, $int);//Set the maximum number of seconds allowed for execution (timeout time). When the value is set to a small value, CURL will abandon pages that take a long time to execute
curl_setopt($ch, CURLOPT_HEADER, 1);//Set to allow the header header file generated when reading the target to be included in the output stream
Basic use of cURL batch processing function
Of course, the functions of cURL do not stop there. You can find more variable settings in the manual. And the most powerful part of cURL is its batch processing function.
cURL’s batch processing also seems to be easy to understand. The following are the general steps:
1.$mh = curl_multi_init();//Initialize a batch handle.
2.curl_multi_add_handle($mh,$ch); //Add the set $ch handle to the batch handle.
3.curl_multi_exec($mh,$running);//Execute the $mh handle and write the running status of the $mh handle into the $running variable
4. When $running is true, the curl_multi_close() function is executed in a loop
5.循环结束后遍历$mh句柄,用curl_multi_getcontent()获取第一个句柄的返回值
6.用curl_multi_remove_handle()将$mh中的句柄移除
7.用curl_multi_close()关闭$mh批处理句柄。
代码如下:
<?<span>php </span><span>$chArr</span>=<span>[]; </span><span>for</span>(<span>$i</span>=0;<span>$i</span><50;<span>$i</span>++<span>){ </span><span>$chArr</span>[<span>$i</span>]=curl_init("http://www.baidu.com"<span>); curl_setopt(</span><span>$chArr</span>[<span>$i</span>],CURLOPT_RETURNTRANSFER,1<span>); } </span><span>$mh</span> = curl_multi_init(); <span>//</span><span>1</span> <span>foreach</span>(<span>$chArr</span> <span>as</span> <span>$k</span> => <span>$ch</span><span>){ curl_multi_add_handle(</span><span>$mh</span>,<span>$ch</span>); <span>//</span><span>2</span> <span> } </span><span>$running</span> = <span>null</span><span>; </span><span>do</span><span>{ curl_multi_exec(</span><span>$mh</span>,<span>$running</span>); <span>//</span><span>3</span> <span> }</span><span>while</span>(<span>$running</span> > 0); <span>//</span><span>4</span> <span>foreach</span>(<span>$chArr</span> <span>as</span> <span>$k</span> => <span>$ch</span><span>){ </span><span>$result</span>[<span>$k</span>]= curl_multi_getcontent(<span>$ch</span>); <span>//</span><span>5</span> <span> curl_multi_remove_handle(</span><span>$mh</span>,<span>$ch</span>);<span>//</span><span>6</span> <span> } curl_multi_close(</span><span>$mh</span>); <span>//</span><span>7</span> ?>
cURL批处理时内存占用过多的问题
但是,执行大批量的句柄时我们会发现一个很严重的问题,那就是执行时系统CPU占用率几乎100%,几乎是死机状态了。纠其原因,那是因为在$running>0,执行 curl_multi_exec($mh,$running)而整个批处理句柄没有全部执行完毕时,系统会不停地执行curl_multi_exec()函数。我们用实验来证明:
我们在循环中curl_multi_exec($mh,$running)句前加入一个echo "a";的语句。我们的目的是执行50次对百度的访问,然后来看一下结果。
从图中滚动条的大小(滚动条已经最小状态了)可以大概看出输出a的个数,500个也不止,所以我们便可以找到占用CPU的罪魁祸首了。
cURL批处理时的内存优化方案
进行改动的方式是应用curl函数库中的curl_multi_select()函数,其函数原型如下:
<p>int curl_multi_select ( resource $mh [, float $timeout = 1.0 ] )</p> <p>阻塞直到cURL批处理连接中有活动连接。成功时返回描述符集合中描述符的数量。失败时,select失败时返回-1,否则返回超时(从底层的select系统调用)。</p>
我用们curl_multi_select()函数来达到没有需要读取的程序就阻塞住的目的。
我们对批处理的第3、4步进行优化,利用其多线程,模拟并发程序。
很多朋友会对手册中提供的代码心存疑惑(我一开始也是),下面的代码及解释。
<span>$running</span> = <span>null</span><span>; </span><span>do</span><span> { </span><span>$mrc</span> = curl_multi_exec(<span>$mh</span>, <span>$running</span><span>); } </span><span>while</span> (<span>$mrc</span> ==<span> CURLM_CALL_MULTI_PERFORM); </span><span>//</span><span>本次循环第一次处理$mh批处理中的$ch句柄,并将$mh批处理的执行状态写入$running,当状态值等于CURLM_CALL_MULTI_PERFORM时,表明数据还在写入或读取中,执行循环,当第一次$ch句柄的数据写入或读取成功后,状态值变为CURLM_OK,跳出本次循环,进入下面的大循环之中。 //$running为true,即$mh批处理之中还有$ch句柄正待处理,$mrc==CURLM_OK,即上一次$ch句柄的读取或写入已经执行完毕。</span> <span>while</span> (<span>$running</span> && <span>$mrc</span> ==<span> CURLM_OK) { </span><span>if</span> (curl_multi_select(<span>$mh</span>) != -1) {<span>//</span><span>$mh批处理中还有可执行的$ch句柄,curl_multi_select($mh) != -1程序退出阻塞状态。</span> <span>do</span> { <span>//</span><span>继续执行需要处理的$ch句柄。</span> <span>$mrc</span> = curl_multi_exec(<span>$mh</span>, <span>$running</span><span>); } </span><span>while</span> (<span>$mrc</span> ==<span> CURLM_CALL_MULTI_PERFORM); } }</span>
这样执行的好处是$mh批处理中的$ch句柄会在读取或写入数据结束后($mrc==CURLM_OK),进入curl_multi_select($mh)的阻塞阶段,而不会在整个$mh批处理执行时不停地执行curl_multi_exec,白白浪费CPU资源。
cURL批处理的内存优化结果
完整代码如下:
<?<span>php </span><span>$chArr</span>=<span>[]; </span><span>for</span>(<span>$i</span>=0;<span>$i</span><50;<span>$i</span>++<span>){ </span><span>$chArr</span>[<span>$i</span>]=curl_init("http://www.baidu.com"<span>); curl_setopt(</span><span>$chArr</span>[<span>$i</span>],CURLOPT_RETURNTRANSFER,1<span>); } </span><span>$mh</span> =<span> curl_multi_init(); </span><span>foreach</span>(<span>$chArr</span> <span>as</span> <span>$k</span> => <span>$ch</span><span>) curl_multi_add_handle(</span><span>$mh</span>,<span>$ch</span><span>); <br /> </span><span>$running</span> = <span>null</span><span>; </span><span>do</span><span> { <br /></span><span> $mrc</span> = curl_multi_exec(<span>$mh</span>, <span>$running</span><span>); } </span><span>while</span> (<span>$mrc</span> ==<span> CURLM_CALL_MULTI_PERFORM); </span><span>while</span> (<span>$running</span> && <span>$mrc</span> ==<span> CURLM_OK) { </span><span>if</span> (curl_multi_select(<span>$mh</span>) != -1<span>) { </span><span>do</span><span> { </span><span>$mrc</span> = curl_multi_exec(<span>$mh</span>, <span>$running</span><span>); } </span><span>while</span> (<span>$mrc</span> ==<span> CURLM_CALL_MULTI_PERFORM); } } </span><span>foreach</span>(<span>$chArr</span> <span>as</span> <span>$k</span> => <span>$ch</span><span>){ </span><span>$result</span>[<span>$k</span>]= curl_multi_getcontent(<span>$ch</span><span>); curl_multi_remove_handle(</span><span>$mh</span>,<span>$ch</span><span>); } curl_multi_close(</span><span>$mh</span><span>); </span>?>
我们再次在 $mrc = curl_multi_exec($mh, $running)句子前加入echo "a";结果如下图:
虽然也不止50次,但是比之未优化前,CPU使用率已经大为改观。
虽然curl函数非常强大,但是我们还是有使用其他函数来发送POST请求的机会,另外也能从更底层了解curl函数,所以本辑也用大很大篇幅在其他函数上。
OK,本辑结束,写这辑博文的同时,我也学习到了很多。如果您觉得本博文对您有帮助,请您点推荐或关注我,我们继续分享我的笔记总结。如果有什么问题,您可以在下方留言讨论,谢谢阅读。

What’s still popular is the ease of use, flexibility and a strong ecosystem. 1) Ease of use and simple syntax make it the first choice for beginners. 2) Closely integrated with web development, excellent interaction with HTTP requests and database. 3) The huge ecosystem provides a wealth of tools and libraries. 4) Active community and open source nature adapts them to new needs and technology trends.

PHP and Python are both high-level programming languages that are widely used in web development, data processing and automation tasks. 1.PHP is often used to build dynamic websites and content management systems, while Python is often used to build web frameworks and data science. 2.PHP uses echo to output content, Python uses print. 3. Both support object-oriented programming, but the syntax and keywords are different. 4. PHP supports weak type conversion, while Python is more stringent. 5. PHP performance optimization includes using OPcache and asynchronous programming, while Python uses cProfile and asynchronous programming.

PHP is mainly procedural programming, but also supports object-oriented programming (OOP); Python supports a variety of paradigms, including OOP, functional and procedural programming. PHP is suitable for web development, and Python is suitable for a variety of applications such as data analysis and machine learning.

PHP originated in 1994 and was developed by RasmusLerdorf. It was originally used to track website visitors and gradually evolved into a server-side scripting language and was widely used in web development. Python was developed by Guidovan Rossum in the late 1980s and was first released in 1991. It emphasizes code readability and simplicity, and is suitable for scientific computing, data analysis and other fields.

PHP is suitable for web development and rapid prototyping, and Python is suitable for data science and machine learning. 1.PHP is used for dynamic web development, with simple syntax and suitable for rapid development. 2. Python has concise syntax, is suitable for multiple fields, and has a strong library ecosystem.

PHP remains important in the modernization process because it supports a large number of websites and applications and adapts to development needs through frameworks. 1.PHP7 improves performance and introduces new features. 2. Modern frameworks such as Laravel, Symfony and CodeIgniter simplify development and improve code quality. 3. Performance optimization and best practices further improve application efficiency.

PHPhassignificantlyimpactedwebdevelopmentandextendsbeyondit.1)ItpowersmajorplatformslikeWordPressandexcelsindatabaseinteractions.2)PHP'sadaptabilityallowsittoscaleforlargeapplicationsusingframeworkslikeLaravel.3)Beyondweb,PHPisusedincommand-linescrip

PHP type prompts to improve code quality and readability. 1) Scalar type tips: Since PHP7.0, basic data types are allowed to be specified in function parameters, such as int, float, etc. 2) Return type prompt: Ensure the consistency of the function return value type. 3) Union type prompt: Since PHP8.0, multiple types are allowed to be specified in function parameters or return values. 4) Nullable type prompt: Allows to include null values and handle functions that may return null values.


Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

MantisBT
Mantis is an easy-to-deploy web-based defect tracking tool designed to aid in product defect tracking. It requires PHP, MySQL and a web server. Check out our demo and hosting services.

mPDF
mPDF is a PHP library that can generate PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files "on the fly" from his website and handle different languages. It is slower than original scripts like HTML2FPDF and produces larger files when using Unicode fonts, but supports CSS styles etc. and has a lot of enhancements. Supports almost all languages, including RTL (Arabic and Hebrew) and CJK (Chinese, Japanese and Korean). Supports nested block-level elements (such as P, DIV),

Dreamweaver CS6
Visual web development tools

DVWA
Damn Vulnerable Web App (DVWA) is a PHP/MySQL web application that is very vulnerable. Its main goals are to be an aid for security professionals to test their skills and tools in a legal environment, to help web developers better understand the process of securing web applications, and to help teachers/students teach/learn in a classroom environment Web application security. The goal of DVWA is to practice some of the most common web vulnerabilities through a simple and straightforward interface, with varying degrees of difficulty. Please note that this software

ZendStudio 13.5.1 Mac
Powerful PHP integrated development environment