
Example: Scraping Images with cURL in a PHP Scheduled Task

WBOY, original post, 2016-06-08 17:20:34

This article walks through an example of scraping images with cURL from a PHP scheduled task.


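The basic idea is to fetch a page by URL, collect the addresses of all the images on it, then loop over those addresses, download each image with the file functions, and save it locally. The alt attribute of each image is collected as well, and the data is finally saved to your own database.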
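As a rough sketch of the "download and save locally" step, the helper below validates raw image bytes and writes them to disk. The name `saveImageBytes`, the `$minSide` parameter, and the temp directory are assumptions for illustration and do not appear in the listing. It inspects the bytes with `getimagesizefromstring()`, so the image only has to be requested once.

```php
<?php
// Hypothetical helper (not from the original listing): validate raw image
// bytes and write them into $dirname. Returns the saved path, or null.
function saveImageBytes(string $data, string $dirname, int $minSide = 10): ?string
{
    // Inspect the bytes directly, so no second request is needed.
    $info = @getimagesizefromstring($data);
    if ($info === false) {
        return null; // not a recognisable image
    }
    // Map the detected type to a file extension; other formats are skipped.
    $extensions = [
        IMAGETYPE_GIF  => 'gif',
        IMAGETYPE_JPEG => 'jpg',
        IMAGETYPE_PNG  => 'png',
    ];
    if (!isset($extensions[$info[2]])) {
        return null;
    }
    // Very small images are usually icons or spacers, not content.
    if ($info[0] < $minSide || $info[1] < $minSide) {
        return null;
    }
    if (!is_dir($dirname) && !mkdir($dirname, 0755, true)) {
        return null;
    }
    $path = rtrim($dirname, '/') . '/' . uniqid('img_', true) . '.' . $extensions[$info[2]];
    return file_put_contents($path, $data) === false ? null : $path;
}

// Usage with an in-memory 1x1 GIF, so the example needs no network access.
$gif = base64_decode('R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7');
$dir = sys_get_temp_dir() . '/img_demo';
$saved    = saveImageBytes($gif, $dir, 1);     // accepted: minimum side is 1 px
$rejected = saveImageBytes($gif, $dir, 10);    // rejected: smaller than 10 px
$notImage = saveImageBytes('plain text', $dir); // rejected: not an image
```

In a real run, `$data` would be the body returned by cURL or `file_get_contents()` for one image URL.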

The full program follows. It relies on a PHP scheduled task and on the third-party library simple_html_dom.php; see that library's documentation for how to download and use it.
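If adding a third-party file is not an option, PHP's bundled DOM extension can do similar extraction. The sketch below is an alternative technique, not what the listing uses, and `extractImages` is a hypothetical name. It pulls the `src` and `alt` of every `<img>` from an HTML string.

```php
<?php
// Sketch using PHP's bundled DOM extension instead of simple_html_dom.
// extractImages() is a hypothetical helper, not part of the original listing.
function extractImages(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; keep parser warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        if ($src === '') {
            continue; // nothing to download for this tag
        }
        // getAttribute() returns an empty string when alt is missing.
        $images[] = ['src' => $src, 'alt' => $img->getAttribute('alt')];
    }
    return $images;
}

// Usage on a small inline sample; no network access is needed.
$sample = '<div><img src="/a.jpg" alt="First"><img alt="no source"><img src="/b.png"></div>';
$found = extractImages($sample);
```

For pages with non-ASCII alt text, the character encoding may need to be declared before calling `loadHTML()`, for example by prepending `<?xml encoding="UTF-8">` to the markup.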

Code:

<?php

  // Fetch a listing page and collect the detail-page links and their titles.
  function getLink($url){
    include_once('simple_html_dom.php');
    $ch = curl_init();
    curl_setopt($ch,CURLOPT_URL,$url);
    curl_setopt($ch,CURLOPT_HEADER,false);
    curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
    $output = curl_exec($ch);
    curl_close($ch);
    $arr = array('links' => array(), 'title' => array());
    if($output === false) return $arr; // request failed, nothing to parse
    $html = new simple_html_dom();
    $html->load($output);
    foreach($html->find('a') as $element){
      // Keep only first pages of content items, e.g. /content_123_1.html
      if(preg_match('#^/content_[0-9]+_1\.html$#i',$element->href)){
        $link = 'http://www.111cn.net'.$element->href;
        // De-duplicate by link so that links and titles stay index-aligned
        if(!in_array($link,$arr['links'])){
          $arr['links'][] = $link;
          $arr['title'][] = $element->title;
        }
      }
    }
    return $arr;
  }

  

  // Download one image from a detail page into $dirname.
  // Returns the saved path, or null when no usable image is found.
  function loadimg($url,$dirname){
    include_once('simple_html_dom.php');
    $ch = curl_init();
    curl_setopt($ch,CURLOPT_URL,$url);
    curl_setopt($ch,CURLOPT_HEADER,false);
    curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
    $output = curl_exec($ch);
    curl_close($ch);
    if($output === false) return null;
    $html = new simple_html_dom();
    $html->load($output);
    $image = null;
    // Selector kept from the original: <img> tags that carry a "w" attribute
    foreach($html->find('img[w]') as $element){
      $image = $element->src; // the last match wins
    }
    if(!$image) return null;
    $data = file_get_contents($image);
    if($data === false) return null;
    $info = getimagesize($image); // image info: dimensions and type
    if($info === false) return null;
    switch($info[2]){
      case 1:
        $str = 'gif';
        break;
      case 2:
        $str = 'jpg';
        break;
      case 3:
        $str = 'png';
        break;
      default:
        return null; // unsupported format
    }
    if($info[1] < 10 || $info[0] < 10) return null; // too small to be a useful image
    $filename = time().rand(1,999999).'.'.$str;
    if(!is_dir($dirname)){
      mkdir($dirname,0777,true);
    }
    $fp = fopen($dirname.$filename,'w');
    fwrite($fp,$data);
    fclose($fp);
    return $dirname.$filename;
  }

  set_time_limit(0);       // let the script run indefinitely
  ignore_user_abort(true); // keep running after the client disconnects
  do{
    $img = getLink('http://www.111cn.net/qutu_1.html');
    $count = count($img['links']);
    // Reassemble the data into the usual two-dimensional array, ready for database handling
    $res = array();
    for($i=0;$i<$count;$i++){
      $path = loadimg($img['links'][$i],'images/');
      if($path === null) continue; // no usable image on this page
      $title = isset($img['title'][$i]) ? $img['title'][$i] : '';
      $res[] = array('title' => $title, 'url' => $path);
    }
    echo '<br/>';
    foreach($res as $item){
      $title = htmlspecialchars($item['title']);
      echo '<img src="'.$item['url'].'" alt="'.$title.'">'.$title.'<br />';
    }
    $interval = 24*3600; // run once a day
    sleep($interval);
  }while(true);

  

?>
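The listing builds a two-dimensional array for the database but stops short of storing it. A minimal sketch of that last step with PDO might look like the following. The table name `images`, its columns, the function name `saveRecords`, and the in-memory SQLite connection are all assumptions for illustration, and the `pdo_sqlite` extension is assumed to be available.

```php
<?php
// Sketch only: the table `images` and its columns are hypothetical.
// Each record is expected to have 'title' and 'url' keys.
function saveRecords(PDO $pdo, array $records): void
{
    $pdo->exec('CREATE TABLE IF NOT EXISTS images (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        url TEXT NOT NULL UNIQUE
    )');
    // Prepared statement: values are bound, never concatenated into the SQL.
    // INSERT OR IGNORE is SQLite syntax; it skips rows whose url already exists.
    $stmt = $pdo->prepare('INSERT OR IGNORE INTO images (title, url) VALUES (:title, :url)');
    $pdo->beginTransaction();
    foreach ($records as $row) {
        $stmt->execute([':title' => $row['title'], ':url' => $row['url']]);
    }
    $pdo->commit();
}

// Usage with an in-memory SQLite database and made-up records.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$records = [
    ['title' => 'First',       'url' => 'images/1.jpg'],
    ['title' => 'Second',      'url' => 'images/2.png'],
    ['title' => 'First again', 'url' => 'images/1.jpg'], // duplicate url, skipped
];
saveRecords($pdo, $records);
$stored = (int) $pdo->query('SELECT COUNT(*) FROM images')->fetchColumn();
```

With MySQL or another server, the DSN and the duplicate-handling clause would differ, but the prepared-statement pattern stays the same.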
