


Performance comparison of file_get_contents and the curl functions for collecting images
Most of the car content in the back end of one of our company's car websites comes from Autohome, so our editors have had to copy car data over from Autohome by hand every day, which is a real pain. To change that, as the developer, I was given a task: build a feature where you paste in the corresponding Autohome URL and the data is filled into our back-end form automatically. The basic fields are now filled in, but the car's photo album is still not being collected.
I have written image-collection features before, but most cars on Autohome have a lot of pictures. At first I planned to reuse my earlier approach: call file_get_contents on the page URL, match the image addresses in the returned content, call file_get_contents again on each image URL, and save the results locally. The code is as follows:
<?php
header('Content-type:text/html;charset=utf-8');
set_time_limit(0);

// Simple stopwatch for measuring page execution time.
class runtime
{
    public $StartTime = 0;
    public $StopTime = 0;

    function get_microtime()
    {
        list($usec, $sec) = explode(' ', microtime());
        return ((float)$usec + (float)$sec);
    }

    function start()
    {
        $this->StartTime = $this->get_microtime();
    }

    function stop()
    {
        $this->StopTime = $this->get_microtime();
    }

    // Elapsed time in milliseconds, rounded to one decimal place.
    function spent()
    {
        return round(($this->StopTime - $this->StartTime) * 1000, 1);
    }
}

$runtime = new runtime();
$runtime->start();

$url = 'http://car.autohome.com.cn/pic/series-s15306/289.html#pvareaid=102177';
$rs = file_get_contents($url);
// echo $rs;exit;

// Collect the links to the individual picture pages.
preg_match_all('/(\/pic\/series-s15306\/289-\d+\.html)/', $rs, $urlArr);
$avalie = array_unique($urlArr[0]);

// Collect the image URLs found on each picture page.
$count = array();
foreach ($avalie as $key => $ul) {
    $pattern = '/<img src="(http:\/\/car1\.autoimg\.cn\/upload\/\d+\/\d+\/\d+\/.*?\.jpg)"/';
    preg_match_all($pattern, file_get_contents('http://car.autohome.com.cn'.$ul), $imgSrc);
    $count = array_merge($count, $imgSrc[1]);
}

// Download the images one after another.
$data = array();
foreach ($count as $k => $v) {
    $data[$k] = file_get_contents($v);
}

// Save the images locally.
foreach ($data as $k => $v) {
    file_put_contents('./pic2/'.time().'_'.rand(1, 10000).'.jpg', $v);
}

$runtime->stop();
echo "Page execution time: ".$runtime->spent()." ms";
It turns out that this method works well enough when there are only a few pictures, but it becomes very slow when there are many. It was already hard to run in local tests, let alone in production. After searching on Baidu, I switched to downloading the images with curl. Testing showed an improvement, but it still felt a bit slow. If only PHP had multi-threading...
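I did not keep the intermediate single-handle curl version, so the following is only a minimal sketch of what such a download helper might look like. The function name `fetch_with_curl`, the 5-second connect timeout, and the rule of accepting only HTTP 200 responses are my own assumptions, not part of the original code.

```php
<?php
/**
 * Download one URL with a single curl handle.
 * Hypothetical helper: returns the response body, or null on any failure.
 */
function fetch_with_curl(string $url, int $timeout = 30): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, false);        // do not include response headers in the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);    // assumed connect timeout, in seconds
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);    // overall timeout, in seconds

    $body = curl_exec($ch);
    // Assumption: treat anything other than HTTP 200 as a failed download.
    $ok = $body !== false && curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;

    // The handle is released when $ch goes out of scope.
    return $ok ? $body : null;
}
```

In the script above, the download loop would then become `$data[$k] = fetch_with_curl($v);`, skipping entries that come back as null. The requests still run one after another, which is why this version is only somewhat faster.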
After some digging, I found that PHP's curl library can in fact simulate multi-threading by running several transfers concurrently through the curl_multi_* family of functions. After rewriting, the code looks like this:
<?php
header('Content-type:text/html;charset=utf-8');
set_time_limit(0);

// Simple stopwatch for measuring page execution time.
class runtime
{
    public $StartTime = 0;
    public $StopTime = 0;

    function get_microtime()
    {
        list($usec, $sec) = explode(' ', microtime());
        return ((float)$usec + (float)$sec);
    }

    function start()
    {
        $this->StartTime = $this->get_microtime();
    }

    function stop()
    {
        $this->StopTime = $this->get_microtime();
    }

    // Elapsed time in milliseconds, rounded to one decimal place.
    function spent()
    {
        return round(($this->StopTime - $this->StartTime) * 1000, 1);
    }
}

$runtime = new runtime();
$runtime->start();

$url = 'http://car.autohome.com.cn/pic/series-s15306/289.html#pvareaid=102177';
$rs = file_get_contents($url);

// Collect the links to the individual picture pages.
preg_match_all('/(\/pic\/series-s15306\/289-\d+\.html)/', $rs, $urlArr);
$avalie = array_unique($urlArr[0]);

// Collect the image URLs found on each picture page.
$count = array();
foreach ($avalie as $key => $ul) {
    $pattern = '/<img src="(http:\/\/car1\.autoimg\.cn\/upload\/\d+\/\d+\/\d+\/.*?\.jpg)"/';
    preg_match_all($pattern, file_get_contents('http://car.autohome.com.cn'.$ul), $imgSrc);
    $count = array_merge($count, $imgSrc[1]);
}

// Create one curl handle per image and register it with the multi handle.
$handle = curl_multi_init();
$curl = array();
foreach ($count as $k => $v) {
    $curl[$k] = curl_init($v);
    curl_setopt($curl[$k], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($curl[$k], CURLOPT_HEADER, 0);
    curl_setopt($curl[$k], CURLOPT_TIMEOUT, 30);
    curl_multi_add_handle($handle, $curl[$k]);
}

// Start the transfers.
$active = null;
do {
    $mrc = curl_multi_exec($handle, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);

// Keep driving the transfers until all of them have finished.
while ($active && $mrc == CURLM_OK) {
    // This line is critical on PHP 5.3 and later: without it,
    // curl_multi_select may keep returning -1 and the loop never exits.
    while (curl_multi_exec($handle, $active) === CURLM_CALL_MULTI_PERFORM);
    if (curl_multi_select($handle) != -1) {
        do {
            $mrc = curl_multi_exec($handle, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}

// Read the downloaded content, skipping failed or empty transfers.
$data = array();
foreach ($curl as $k => $v) {
    if (curl_error($curl[$k]) == "") {
        $content = curl_multi_getcontent($curl[$k]);
        if ($content !== null && $content !== '') {
            $data[$k] = $content;
        }
    }
    curl_multi_remove_handle($handle, $curl[$k]);
    curl_close($curl[$k]);
}

// Save the images locally.
foreach ($data as $k => $v) {
    $file = time().'_'.rand(1000, 9999).'.jpg';
    file_put_contents('./pic3/'.$file, $v);
}

curl_multi_close($handle);
$runtime->stop();
echo "Page execution time: ".$runtime->spent()." ms";
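The script above adds every image URL to the multi handle at once. For albums with hundreds of pictures it may be kinder to the remote server to cap the number of simultaneous transfers. The following is a sketch of one way to do that, not what I ran in my tests: the function name `fetch_in_batches`, the default batch size of 10, and the rule of keeping only HTTP 200 responses are all assumptions.

```php
<?php
/**
 * Fetch many URLs concurrently, at most $batchSize at a time.
 * Hypothetical helper: returns bodies keyed like $urls; failed transfers are omitted.
 */
function fetch_in_batches(array $urls, int $batchSize = 10, int $timeout = 30): array
{
    $results = [];
    // array_chunk(..., true) preserves the original keys of $urls.
    foreach (array_chunk($urls, max(1, $batchSize), true) as $batch) {
        $multi = curl_multi_init();
        $handles = [];
        foreach ($batch as $key => $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_HEADER, false);
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
            curl_multi_add_handle($multi, $ch);
            $handles[$key] = $ch;
        }

        // Drive all transfers in this batch until none is still active.
        $active = 0;
        do {
            $status = curl_multi_exec($multi, $active);
            // Wait for socket activity; back off briefly if select reports -1.
            if ($active && curl_multi_select($multi, 1.0) === -1) {
                usleep(1000);
            }
        } while ($active && $status === CURLM_OK);

        foreach ($handles as $key => $ch) {
            // Assumption: keep only responses that came back with HTTP 200.
            if (curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200) {
                $results[$key] = curl_multi_getcontent($ch);
            }
            curl_multi_remove_handle($multi, $ch);
        }
        curl_multi_close($multi);
    }
    return $results;
}
```

With this helper, the download step would shrink to `$data = fetch_in_batches($count, 10);`. Each batch waits for its slowest transfer before the next one starts, so this trades a little speed for a bounded number of open connections.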
Multi-threaded collection really is refreshing. Across a series of comparison tests, the curl_multi version was faster than the file_get_contents version in 4 out of 5 runs, by a factor of roughly 3 to 5. In short, I will use this approach wherever possible in future collection jobs to improve efficiency.
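For anyone who wants to repeat this kind of comparison, here is a small timing sketch. It is not the harness I used: the function names `time_ms`, `compare_runs`, and `median` are hypothetical, and the two workloads are `usleep` stand-ins that you would replace with the real file_get_contents and curl_multi download routines.

```php
<?php
/** Run $fn once and return the elapsed wall-clock time in milliseconds. */
function time_ms(callable $fn): float
{
    $start = microtime(true);
    $fn();
    return round((microtime(true) - $start) * 1000, 1);
}

/** Median of a non-empty list of numbers. */
function median(array $values): float
{
    sort($values);
    $n = count($values);
    $mid = intdiv($n, 2);
    return $n % 2 ? (float)$values[$mid] : ($values[$mid - 1] + $values[$mid]) / 2;
}

/**
 * Time each named callable $runs times, alternating between candidates
 * so that a temporary network slowdown does not penalize only one of them.
 * Returns an array of name => list of timings in milliseconds.
 */
function compare_runs(array $candidates, int $runs = 5): array
{
    $timings = [];
    for ($i = 0; $i < $runs; $i++) {
        foreach ($candidates as $name => $fn) {
            $timings[$name][] = time_ms($fn);
        }
    }
    return $timings;
}

// Stand-in workloads (assumptions); swap in the real download routines.
$timings = compare_runs([
    'sequential' => function () { usleep(20000); },
    'concurrent' => function () { usleep(5000); },
], 3);

foreach ($timings as $name => $ms) {
    printf("%s: median %.1f ms over %d runs\n", $name, median($ms), count($ms));
}
```

Reporting the median rather than a single run makes one unusually slow or fast request less likely to distort the result.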


