Solution to garbled web pages fetched with file_get_contents in PHP

WBOY
2016-07-13 10:57:37

Yesterday, while building a simple scraping function, I used the file_get_contents function directly. Some websites were fetched without any problem, but others came back garbled. After some analysis, I found that the garbled output appeared because the server had gzip compression turned on.

One of the pages I fetched, for example, was served with gzip compression.
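One way to confirm that gzip is the cause is to look at the first two bytes of the fetched body: a gzip stream starts with the bytes 0x1f 0x8b. The sketch below is not from the original article; the helper name maybe_gunzip() is hypothetical, and it assumes PHP's zlib extension is available for gzdecode() and gzencode().

```php
<?php
// Hypothetical helper (not from the original article): decode a body only if
// it starts with the gzip magic bytes 0x1f 0x8b; otherwise return it unchanged.
function maybe_gunzip(string $body): string
{
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);   // requires the zlib extension
        if ($decoded !== false) {
            return $decoded;
        }
    }
    return $body;
}

$plain      = '<html>hello</html>';
$compressed = gzencode($plain);       // simulate a gzip-compressed response

// Both the compressed and the uncompressed body come back as the plain text.
var_dump(maybe_gunzip($compressed) === $plain);
var_dump(maybe_gunzip($plain) === $plain);
```

A helper like this could be applied to whatever file_get_contents returns, so that pages that are not compressed pass through untouched.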

Once the cause is known, the problem is easy to solve. A search on Baidu showed that curl can be used instead.

curl solution

The code is as follows:

function curl_get($url, $gzip = false) {
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);   // return the response instead of printing it
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 10);  // give up connecting after 10 seconds
    if ($gzip) curl_setopt($curl, CURLOPT_ENCODING, "gzip"); // the key is here
    $content = curl_exec($curl);
    curl_close($curl);
    return $content;
}



With $gzip set to true, curl requests gzip encoding and decompresses the response automatically.
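As a rough sketch of how the same idea could be made a little more defensive, the variant below reports failures instead of silently returning false. The function name fetch_page() and the 30-second total timeout are assumptions for illustration, not part of the original code. Passing an empty string to CURLOPT_ENCODING asks curl to offer every encoding it supports, rather than gzip only.

```php
<?php
// Sketch of a variant of curl_get() with error reporting. The name
// fetch_page() and the 30-second total timeout are illustrative assumptions.
function fetch_page(string $url): string
{
    $curl = curl_init($url);
    curl_setopt_array($curl, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => 10,    // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 30,    // seconds allowed for the whole request
        CURLOPT_ENCODING       => '',    // empty string: offer all supported encodings
    ]);

    $content = curl_exec($curl);
    if ($content === false) {
        // curl_error() describes why the transfer failed.
        throw new RuntimeException("Request to $url failed: " . curl_error($curl));
    }

    return $content;
}
```

Calling fetch_page($url) inside a try/catch block lets a scraper log the failing URL and move on to the next one.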

file_get_contents solution:

The code is as follows:

file_get_contents("compress.zlib://".$url);


The above code will work regardless of whether the page is gzip compressed or not!
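The behaviour can be checked offline with local files instead of a URL. The sketch below writes the same text to two temporary files, one gzip-compressed and one plain, and reads both through the compress.zlib:// wrapper. The temporary files and variable names are illustrative, and the zlib extension is assumed to be loaded.

```php
<?php
// Offline demonstration with local temporary files (illustrative only;
// nothing here comes from the original article).
$text = "hello, gzip\n";

$gzFile    = tempnam(sys_get_temp_dir(), 'gz_');
$plainFile = tempnam(sys_get_temp_dir(), 'pl_');
file_put_contents($gzFile, gzencode($text));   // gzip-compressed copy
file_put_contents($plainFile, $text);          // uncompressed copy

// The same wrapper reads both files and yields the original text.
$fromGz    = file_get_contents('compress.zlib://' . $gzFile);
$fromPlain = file_get_contents('compress.zlib://' . $plainFile);

var_dump($fromGz === $text, $fromPlain === $text);

// Clean up the temporary files.
unlink($gzFile);
unlink($plainFile);
```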

Note: the curl solution requires PHP's curl extension to be enabled.
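Whether the extension is enabled can be checked at runtime before choosing an approach. This is a minimal sketch; the function name pick_fetch_method() and the returned labels are hypothetical.

```php
<?php
// Sketch: choose a fetch strategy based on which extensions are loaded.
// The function name and the returned labels are hypothetical.
function pick_fetch_method(): string
{
    if (extension_loaded('curl')) {
        return 'curl';            // the curl solution can be used
    }
    if (extension_loaded('zlib')) {
        return 'compress.zlib';   // fall back to the compress.zlib:// wrapper
    }
    return 'plain';               // neither extension is available
}

echo 'Fetch method: ', pick_fetch_method(), "\n";
```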

curl installation:

Installation under Windows XP:

Open the php.ini file, find the line below, and uncomment it by removing the leading semicolon:

extension=php_curl.dll

Installation under Linux:

The code is as follows:

# wget http://curl.haxx.se/download/curl-7.17.1.tar.gz
# tar zxvf curl-7.17.1.tar.gz
# cd curl-7.17.1
# ./configure --prefix=/usr/local/curl
# make
# make install

The tar command extracts the archive. This is the method for installing curl before installing PHP.