search
HomeBackend DevelopmentPHP TutorialPHP多线程批量采集上载图片

PHP多线程批量采集下载图片

?

使用curl的多线程,另外curl可以设置请求时间,遇到很慢的url资源,可以果断的放弃,这样没有阻塞,另外有多线程请求,效率应该比较高,参考:《CURL的学习和应用[附多线程]》,我们再来测试一下;

核心代码:

?

/**     * curl 多线程     *     * @param array $array 并行网址     * @param int $timeout 超时时间     * @return mix     */	public function Curl_http($array,$timeout='15'){		    $res = array();		    $mh = curl_multi_init();//创建多个curl语柄		    foreach($array as $k=>$url){		        $conn[$k]=curl_init($url);//初始化		        curl_setopt($conn[$k], CURLOPT_TIMEOUT, $timeout);//设置超时时间		        curl_setopt($conn[$k], CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; MSIE 5.01; Windows NT 5.0)');		        curl_setopt($conn[$k], CURLOPT_MAXREDIRS, 7);//HTTp定向级别 ,7最高		        curl_setopt($conn[$k], CURLOPT_HEADER, false);//这里不要header,加块效率		        curl_setopt($conn[$k], CURLOPT_FOLLOWLOCATION, 1); // 302 redirect		        curl_setopt($conn[$k], CURLOPT_RETURNTRANSFER,1);//要求结果为字符串且输出到屏幕上				curl_setopt($conn[$k], CURLOPT_HTTPGET, true);		        curl_multi_add_handle ($mh,$conn[$k]);		    }		     //防止死循环耗死cpu 这段是根据网上的写法		        do {		            $mrc = curl_multi_exec($mh,$active);//当无数据,active=true		        } while ($mrc == CURLM_CALL_MULTI_PERFORM);//当正在接受数据时		        while ($active and $mrc == CURLM_OK) {//当无数据时或请求暂停时,active=true		            if (curl_multi_select($mh) != -1) {		                do {		                    $mrc = curl_multi_exec($mh, $active);		                } while ($mrc == CURLM_CALL_MULTI_PERFORM);		            }		        }		    foreach ($array as $k => $url) {		          if(!curl_errno($conn[$k])){		          	$data[$k]=curl_multi_getcontent($conn[$k]);//数据转换为array		          	$header[$k]=curl_getinfo($conn[$k]);//返回http头信息		          	curl_close($conn[$k]);//关闭语柄		          	curl_multi_remove_handle($mh  , $conn[$k]);   //释放资源		          }else{		          	unset($k,$url);		          }		        }		        curl_multi_close($mh);		        return $data;		 }//参数接收$callback = $_GET['callback'];$hrefs = $_GET['hrefs'];$urlarray = explode(',',trim($hrefs,','));$date = date('Ymd',time());//实例化$img = new HttpImg();$stime = $img->getMicrotime();//开始时间$data = $img->Curl_http($urlarray,'20');//列表数据mkdir('./img/'.$date,0777);foreach ((array)$data as $k=>$v){	preg_match_all("/(href|src)=([\"|']?)([^ \"'>]+\.(jpg|png|PNG|JPG|gif))\\2/i", $v, $matches[$k]);	if(count($matches[$k][3])>0){		$dataimg = $img->Curl_http($matches[$k][3],'20');//全部图片数据二进制		$j = 0;		foreach ((array)$dataimg as $kk=>$vv){			if($vv !=''){				$rand = rand(1000,9999);				$basename = time()."_".$rand.".".jpg;//保存为jpg格式的文件				$fname = './img/'.$date."/"."$basename";				file_put_contents($fname, $vv);				$j++;				echo "创建第".$j."张图片"."$fname"."<br/>";			}else{				unset($kk,$vv);			}		}	}else{		unset($matches);	}}$etime = $img->getMicrotime();//结束时间echo "用时".($etime-$stime)."秒";exit;

?

?

测试一下效果

337张图片用时260秒左右,基本上可以做到一秒内就可以采集一张的效果,而且发现图片越到优势采集速度越明显。

image我们可以看一下文件命名:也就可以做到同一时刻可以生成10张图片,

由于采用了20秒请求的时间限制,有些图片生成后有明显不全,也就是图片资源在20秒内未能完全采集,这个时间大家可以自行设置。

?

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
PHP Dependency Injection Container: A Quick StartPHP Dependency Injection Container: A Quick StartMay 13, 2025 am 12:11 AM

APHPDependencyInjectionContainerisatoolthatmanagesclassdependencies,enhancingcodemodularity,testability,andmaintainability.Itactsasacentralhubforcreatingandinjectingdependencies,thusreducingtightcouplingandeasingunittesting.

Dependency Injection vs. Service Locator in PHPDependency Injection vs. Service Locator in PHPMay 13, 2025 am 12:10 AM

Select DependencyInjection (DI) for large applications, ServiceLocator is suitable for small projects or prototypes. 1) DI improves the testability and modularity of the code through constructor injection. 2) ServiceLocator obtains services through center registration, which is convenient but may lead to an increase in code coupling.

PHP performance optimization strategies.PHP performance optimization strategies.May 13, 2025 am 12:06 AM

PHPapplicationscanbeoptimizedforspeedandefficiencyby:1)enablingopcacheinphp.ini,2)usingpreparedstatementswithPDOfordatabasequeries,3)replacingloopswitharray_filterandarray_mapfordataprocessing,4)configuringNginxasareverseproxy,5)implementingcachingwi

PHP Email Validation: Ensuring Emails Are Sent CorrectlyPHP Email Validation: Ensuring Emails Are Sent CorrectlyMay 13, 2025 am 12:06 AM

PHPemailvalidationinvolvesthreesteps:1)Formatvalidationusingregularexpressionstochecktheemailformat;2)DNSvalidationtoensurethedomainhasavalidMXrecord;3)SMTPvalidation,themostthoroughmethod,whichchecksifthemailboxexistsbyconnectingtotheSMTPserver.Impl

How to make PHP applications fasterHow to make PHP applications fasterMay 12, 2025 am 12:12 AM

TomakePHPapplicationsfaster,followthesesteps:1)UseOpcodeCachinglikeOPcachetostoreprecompiledscriptbytecode.2)MinimizeDatabaseQueriesbyusingquerycachingandefficientindexing.3)LeveragePHP7 Featuresforbettercodeefficiency.4)ImplementCachingStrategiessuc

PHP Performance Optimization Checklist: Improve Speed NowPHP Performance Optimization Checklist: Improve Speed NowMay 12, 2025 am 12:07 AM

ToimprovePHPapplicationspeed,followthesesteps:1)EnableopcodecachingwithAPCutoreducescriptexecutiontime.2)ImplementdatabasequerycachingusingPDOtominimizedatabasehits.3)UseHTTP/2tomultiplexrequestsandreduceconnectionoverhead.4)Limitsessionusagebyclosin

PHP Dependency Injection: Improve Code TestabilityPHP Dependency Injection: Improve Code TestabilityMay 12, 2025 am 12:03 AM

Dependency injection (DI) significantly improves the testability of PHP code by explicitly transitive dependencies. 1) DI decoupling classes and specific implementations make testing and maintenance more flexible. 2) Among the three types, the constructor injects explicit expression dependencies to keep the state consistent. 3) Use DI containers to manage complex dependencies to improve code quality and development efficiency.

PHP Performance Optimization: Database Query OptimizationPHP Performance Optimization: Database Query OptimizationMay 12, 2025 am 12:02 AM

DatabasequeryoptimizationinPHPinvolvesseveralstrategiestoenhanceperformance.1)Selectonlynecessarycolumnstoreducedatatransfer.2)Useindexingtospeedupdataretrieval.3)Implementquerycachingtostoreresultsoffrequentqueries.4)Utilizepreparedstatementsforeffi

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!

SecLists

SecLists

SecLists is the ultimate security tester's companion. It is a collection of various types of lists that are frequently used during security assessments, all in one place. SecLists helps make security testing more efficient and productive by conveniently providing all the lists a security tester might need. List types include usernames, passwords, URLs, fuzzing payloads, sensitive data patterns, web shells, and more. The tester can simply pull this repository onto a new test machine and he will have access to every type of list he needs.

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)