How to implement a concurrent batch file download class with PHP cURL

jacklove
2018-06-08 16:55:33

Batch downloads are usually done in a loop, one file at a time. If bandwidth and server capacity allow, running several transfers at once can shorten the total download time considerably. This article shows how to use PHP's curl_multi functions to download multiple files concurrently. Although the class below calls them processes, curl_multi runs the transfers concurrently inside a single PHP process.

Principle:

Use cURL's multi interface to run several transfers at once and download the files in batches.

Main functions:

curl_multi_init
Returns a new cURL multi handle

curl_multi_add_handle
Adds a regular cURL handle to a cURL multi handle

curl_multi_exec
Runs the sub-connections of the current cURL multi handle

curl_multi_getcontent
Returns the content of a cURL handle if CURLOPT_RETURNTRANSFER is set

curl_multi_remove_handle
Removes a handle from a cURL multi handle

curl_multi_close
Closes a cURL multi handle
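
The functions listed above are typically called in the order shown in the following minimal sketch. It is not the class presented in this article: fetch_all and its parameters are hypothetical names chosen for illustration, each response body is kept in memory, and curl_multi_select and curl_multi_info_read are used in addition to the functions listed.

```php
<?php
/**
 * Fetch several URLs concurrently with the curl_multi_* functions.
 * fetch_all() is a hypothetical helper used only for illustration.
 *
 * @param  array $urls    URLs to fetch (keys are preserved)
 * @param  int   $timeout per-transfer timeout in seconds
 * @return array          body string per key, or null if the transfer failed
 */
function fetch_all(array $urls, $timeout = 10)
{
    $mh = curl_multi_init();
    $handles = array();

    // One regular cURL handle per URL, attached to the multi handle
    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is active any more
    $active = 0;
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select could not wait; back off briefly
        }
    } while ($active && $status === CURLM_OK);

    // Find out which transfers ended with a cURL error
    $failed = array();
    while ($info = curl_multi_info_read($mh)) {
        if ($info['result'] !== CURLE_OK) {
            $failed[] = $info['handle'];
        }
    }

    // Read the bodies and detach the handles
    $bodies = array();
    foreach ($handles as $key => $ch) {
        $bodies[$key] = in_array($ch, $failed, true) ? null : curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);

    return $bodies;
}
```

A call such as fetch_all(array('a' => 'http://www.example.com/p1.jpg')) would return an array with the same key 'a'.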

The complete code is as follows:

BatchDownLoad.class.php

<?php
/**
 * Batch download files concurrently (implemented with PHP curl_multi_exec)
 * Date:    2017-07-16
 * Author:  fdipzone
 * Version: 1.0
 *
 * Func
 * public  download  run the download
 * public  process   download one batch concurrently
 * private to_log    write the results to the log file
 */
class BatchDownLoad {

    // Download file settings
    private $download_config = array();

    // Maximum number of concurrent transfers
    private $max_process_num = 10;

    // Timeout in seconds
    private $timeout = 10;

    // Log file
    private $logfile = null;

    /**
     * Initialize
     * @param  Array  $download_config   files to download, each entry is array(url, local path)
     * @param  Int    $max_process_num   maximum number of concurrent transfers
     * @param  Int    $timeout           timeout in seconds
     * @param  String $logfile           log file path
     */
    public function __construct($download_config, $max_process_num=10, $timeout=10, $logfile=''){
        $this->download_config = $download_config;

        // At least 1, otherwise download() would never take anything off the queue
        $this->max_process_num = max(1, (int)$max_process_num);
        $this->timeout = $timeout;

        // Log file
        if($logfile){
            $this->logfile = $logfile;
        }else{
            $this->logfile = dirname(__FILE__).'/batch_download_'.date('Ymd').'.log';
        }
    }

    /**
     * Run the download
     * @return Int number of files processed
     */
    public function download(){

        // Number of files processed
        $handle_num = 0;

        // While there are still files to process
        while(count($this->download_config)>0){

            // More files remain than the concurrency limit
            if(count($this->download_config)>$this->max_process_num){
                $process_num = $this->max_process_num;
            // Fewer files remain than the concurrency limit
            }else{
                $process_num = count($this->download_config);
            }

            // Take the next batch off the queue
            $tmp_download_config = array_splice($this->download_config, 0, $process_num);

            // Download it
            $result = $this->process($tmp_download_config);

            // Write the log
            $this->to_log($tmp_download_config, $result);

            // Count the processed files
            $handle_num += count($result);

        }

        return $handle_num;

    }

    /**
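     * Download one batch of files concurrently
     * @param  Array $download_config settings for this batch
     * @return Array download status of each file (true or false)
     */
    public function process($download_config){

        // cURL handles
        $ch = array();

        // cURL result code of each transfer
        $errno = array();

        // Results
        $result = array();

        // Create the cURL multi handle
        $mh = curl_multi_init();

        // Set up one handle per file
        foreach($download_config as $k=>$config){
            $ch[$k] = curl_init();

            curl_setopt($ch[$k], CURLOPT_URL, $config[0]);
            curl_setopt($ch[$k], CURLOPT_HEADER, 0);
            curl_setopt($ch[$k], CURLOPT_RETURNTRANSFER, true);
            // Apply the configured timeout and treat HTTP errors (>= 400) as failures
            curl_setopt($ch[$k], CURLOPT_TIMEOUT, $this->timeout);
            curl_setopt($ch[$k], CURLOPT_FAILONERROR, true);
            curl_setopt($ch[$k], CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)');

            // Add to the multi handle
            curl_multi_add_handle($mh, $ch[$k]);
        }

        // Run the transfers, waiting for activity instead of spinning
        $active = null;
        do{
            $mrc = curl_multi_exec($mh, $active);
            if($active && curl_multi_select($mh, 1.0) === -1){
                usleep(1000);
            }
        } while($active && $mrc == CURLM_OK);

        // Collect the result code of each transfer
        while($info = curl_multi_info_read($mh)){
            $errno[array_search($info['handle'], $ch, true)] = $info['result'];
        }

        // Save the data and release the handles
        foreach($download_config as $k=>$config){

            // The transfer succeeded only if cURL reported no error
            $result[$k] = isset($errno[$k]) && $errno[$k] === CURLE_OK;

            if($result[$k]){
                // Create the target directory if needed
                if(!is_dir(dirname($config[1]))){
                    mkdir(dirname($config[1]), 0777, true);
                }
                // Overwrite rather than append, so a rerun does not corrupt the file
                $result[$k] = file_put_contents($config[1], curl_multi_getcontent($ch[$k])) !== false;
            }

            curl_multi_remove_handle($mh, $ch[$k]);
        }

        curl_multi_close($mh);

        return $result;

    }

    /**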
     * Write the log
     * @param Array $data download settings of the batch
     * @param Array $flag download status of each file
     */
    private function to_log($data, $flag){

        // Temporary log data
        $tmp_log = '';

        foreach($data as $k=>$v){
            // Log the status as 1 or 0 (a bare false would print as an empty string)
            $tmp_log .= '['.date('Y-m-d H:i:s').'] url:'.$v[0].' file:'.$v[1].' status:'.($flag[$k] ? 1 : 0).PHP_EOL;
        }

        // Create the log directory
        if(!is_dir(dirname($this->logfile))){
            mkdir(dirname($this->logfile), 0777, true);
        }

        // Write the log file
        file_put_contents($this->logfile, $tmp_log, FILE_APPEND);
    }

}
?>
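
Because CURLOPT_RETURNTRANSFER is enabled, the class above holds each downloaded body in memory before it reaches the disk. For large files, one alternative is to leave CURLOPT_RETURNTRANSFER unset and let cURL stream each body directly into a file through CURLOPT_FILE. The sketch below shows that variant under stated assumptions: stream_all and the shape of $jobs are hypothetical and not part of BatchDownLoad, and the destination directories are assumed to exist.

```php
<?php
/**
 * Download several files concurrently, streaming each body straight to disk.
 * stream_all() is a hypothetical helper, not part of BatchDownLoad.
 *
 * @param  array $jobs    list of array(url, destination path) pairs
 * @param  int   $timeout per-transfer timeout in seconds
 * @return array          true per key when cURL reported no error
 */
function stream_all(array $jobs, $timeout = 10)
{
    $mh = curl_multi_init();
    $ch = array();
    $fp = array();
    $result = array();

    foreach ($jobs as $k => $job) {
        $result[$k] = false;
        $file = fopen($job[1], 'wb'); // 'wb' truncates, so a rerun does not append
        if ($file === false) {
            continue;
        }
        $fp[$k] = $file;
        $ch[$k] = curl_init();
        curl_setopt($ch[$k], CURLOPT_URL, $job[0]);
        curl_setopt($ch[$k], CURLOPT_FILE, $fp[$k]);      // write the body to the file
        curl_setopt($ch[$k], CURLOPT_FAILONERROR, true);  // HTTP >= 400 counts as an error
        curl_setopt($ch[$k], CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch[$k]);
    }

    // Drive all transfers until none is active any more
    $active = 0;
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000);
        }
    } while ($active && $status === CURLM_OK);

    // Map each finished transfer back to its job key
    while ($info = curl_multi_info_read($mh)) {
        $k = array_search($info['handle'], $ch, true);
        if ($k !== false) {
            $result[$k] = ($info['result'] === CURLE_OK);
        }
    }

    foreach ($ch as $k => $handle) {
        curl_multi_remove_handle($mh, $handle);
        fclose($fp[$k]);
        if (!$result[$k]) {
            unlink($jobs[$k][1]); // drop the empty or partial file of a failed transfer
        }
    }
    curl_multi_close($mh);

    return $result;
}
```

The trade-off is that the data can no longer be inspected before it is written, so the success check has to rely on the result code reported by curl_multi_info_read.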

demo.php

<?php
require 'BatchDownLoad.class.php';

$base_path = dirname(__FILE__).'/photo';

$download_config = array(
    array('http://www.example.com/p1.jpg', $base_path.'/p1.jpg'),
    array('http://www.example.com/p2.jpg', $base_path.'/p2.jpg'),
    array('http://www.example.com/p3.jpg', $base_path.'/p3.jpg'),
    array('http://www.example.com/p4.jpg', $base_path.'/p4.jpg'),
    array('http://www.example.com/p5.jpg', $base_path.'/p5.jpg'),
);

$obj = new BatchDownLoad($download_config, 2, 10);
$handle_num = $obj->download();

echo 'download num:'.$handle_num.PHP_EOL;
?>
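
The demo passes five files and a limit of 2, so download() works through three batches of 2, 2 and 1 files. The snippet below reproduces only that splitting step with array_splice, the same call the class uses. split_batches is a hypothetical helper written for illustration.

```php
<?php
/**
 * Split a list into consecutive batches of at most $limit items.
 * split_batches() is a hypothetical helper that mirrors the loop in download().
 */
function split_batches(array $items, $limit)
{
    $limit = max(1, (int)$limit); // a limit below 1 would never shrink the queue
    $batches = array();

    while (count($items) > 0) {
        // array_splice removes the batch from $items and returns it
        $batches[] = array_splice($items, 0, min($limit, count($items)));
    }

    return $batches;
}

$sizes = array_map('count', split_batches(array('p1', 'p2', 'p3', 'p4', 'p5'), 2));
echo implode(',', $sizes), PHP_EOL; // prints 2,2,1
```

Each batch must finish completely before the next one starts, so one slow file delays the rest of the queue.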

Log output after execution

[2017-07-16 18:04:21] url:http://www.example.com/p1.jpg file:/home/fdipzone/photo/p1.jpg status:1
[2017-07-16 18:04:21] url:http://www.example.com/p2.jpg file:/home/fdipzone/photo/p2.jpg status:1
[2017-07-16 18:04:21] url:http://www.example.com/p3.jpg file:/home/fdipzone/photo/p3.jpg status:1
[2017-07-16 18:04:21] url:http://www.example.com/p4.jpg file:/home/fdipzone/photo/p4.jpg status:1
[2017-07-16 18:04:21] url:http://www.example.com/p5.jpg file:/home/fdipzone/photo/p5.jpg status:1
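
Each log entry has the form [timestamp] url:<url> file:<path> status:<flag>, as written by to_log. As a sketch of how such a log could be checked afterwards, the hypothetical failed_urls helper below returns the URLs whose status is anything other than 1. It assumes one entry per line and URLs and paths that contain no spaces.

```php
<?php
/**
 * Return the URLs of log entries whose status is not 1.
 * failed_urls() is a hypothetical helper; the line format is the one written by to_log().
 *
 * @param  string $log contents of the log file
 * @return array       list of URLs that were not downloaded successfully
 */
function failed_urls($log)
{
    $failed = array();

    // Split on any line ending and skip empty lines
    foreach (preg_split('/\R/', $log, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        $pattern = '/^\[[^\]]+\] url:(\S+) file:(\S+) status:(\d*)\s*$/';
        // An empty status counts as failed, since a bare false prints as ''
        if (preg_match($pattern, $line, $m) && $m[3] !== '1') {
            $failed[] = $m[1];
        }
    }

    return $failed;
}
```

The returned list could be turned back into a download configuration and passed to a new BatchDownLoad instance to retry the failed files.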
