
cURL scraping in PHP returns garbled or empty pages: how to fix it

WBOY (Original)
2016-06-13 13:50:30, 1091 views

The PHP script is saved in GB2312 encoding:

<?php
$url = "http://www.sina.com.cn"; // GB2312-encoded
//$url = "http://www.163.com";   // GB2312-encoded
//$url = "http://www.sohu.com";  // GB2312-encoded

  $ch = curl_init($url);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response as a string instead of printing it
  curl_setopt($ch, CURLOPT_TIMEOUT, 1);           // timeout in seconds; must be set before curl_exec() to take effect
  $ret = curl_exec($ch);
  curl_close($ch);
  echo $ret;

?>

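Fetching sina.com.cn works fine, but fetching 163.com returns an empty result, and fetching sohu.com returns garbled output.
What is going on, and how can I fix it? Can anyone help? Thanks in advance! I don't have many points left to offer, sorry.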
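
Before changing the request, it can help to look at what cURL actually received. The sketch below is not part of the original thread; `describe_response()` is a hypothetical helper that reports whether a body is empty, starts with the gzip magic bytes (`\x1f\x8b`), or looks like plain text. An empty body often means the request failed or the server rejected it, while a gzip signature means the response is compressed and has to be decoded before it is displayed.

```php
<?php
// Hypothetical helper (not from the original thread): classify a raw
// response body so "empty" and "garbled" results can be told apart.
function describe_response($body)
{
    if ($body === false || $body === null || $body === '') {
        return 'empty';   // request failed or the server sent nothing
    }
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        return 'gzip';    // compressed payload; decode before use
    }
    return 'plain';
}

// Usage sketch (needs the curl extension and network access):
// $ch = curl_init('http://www.sohu.com');
// curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// $ret = curl_exec($ch);
// echo describe_response($ret), ' / ', curl_error($ch), "\n";
// curl_close($ch);
```

If the helper reports `empty`, `curl_error($ch)` and `curl_getinfo($ch, CURLINFO_HTTP_CODE)` are the next things worth printing.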

------Solution--------------------
Enough talk, I'm just here for the points. OP, remember to award full points.

PHP code


$curl=curl_init('http://www.163.com');
curl_setopt($curl,CURLOPT_RETURNTRANSFER,1);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)');
$html=curl_exec($curl);
var_dump($html);


$curl=curl_init('http://www.sohu.com');
curl_setopt($curl,CURLOPT_RETURNTRANSFER,1);
curl_setopt($curl,CURLOPT_USERAGENT,'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322)');
$html=curl_exec($curl);
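//$html=strstr($html,'<…
// [The rest of this line and the opening of the poster's gzip-decoding function
//  are missing from the source; only the tail of that function survives below.]
  if (… > 0) {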
   switch ($method) {   
     case 8:   
       // Currently the only supported compression method:   
       $data = gzinflate($body);   
       break;   
     default:   
       // Unknown compression method   
       return false;   
   }   
  } else {   
   // I'm not sure if zero-byte body content is allowed.   
   // Allow it for now...  Do nothing...   
  }   
  
  // Verify decompressed size and CRC32:
  // NOTE: This may fail with large data sizes depending on how   
  //      PHP's integer limitations affect strlen() since $isize   
  //      may be negative for large sizes.   
  if ($isize != strlen($data) || crc32($data) != $datacrc) {   
   // Bad format!  Length or CRC doesn't match!   
   return false;   
  }   
  return $data;   
}
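
The manual gzip handling above can often be avoided. As a sketch (not the original poster's code): setting `CURLOPT_ENCODING` to an empty string asks cURL to advertise every encoding it supports and to decode the response itself, and PHP 5.4 and later ship a built-in `gzdecode()` that can serve as a fallback. If the page is served in GBK/GB2312 and the output page is UTF-8, the text also needs converting. `fetch_page()` and `normalize_body()` are hypothetical names, and the default source charset `GBK` is an assumption about the target sites.

```php
<?php
// Sketch only: fetch_page() and normalize_body() are illustrative names.

// Decode a gzip body if needed, then convert the text to UTF-8.
// $sourceCharset = 'GBK' is an assumption about the fetched site.
function normalize_body($raw, $sourceCharset = 'GBK')
{
    // gzip streams start with the magic bytes 0x1f 0x8b
    if (strncmp($raw, "\x1f\x8b", 2) === 0) {
        // @ hides the warning on corrupt data; false is handled below
        $decoded = @gzdecode($raw);   // built in since PHP 5.4 (zlib)
        if ($decoded !== false) {
            $raw = $decoded;
        }
    }
    // iconv() returns false on invalid input; fall back to the raw bytes
    $utf8 = @iconv($sourceCharset, 'UTF-8', $raw);
    return $utf8 === false ? $raw : $utf8;
}

function fetch_page($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_ENCODING, '');   // let cURL negotiate and decode compression
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; ExampleFetcher/1.0)');
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);    // seconds; set before curl_exec()
    $raw = curl_exec($ch);
    curl_close($ch);
    return $raw === false ? false : normalize_body($raw);
}
```

Usage would look like `echo fetch_page('http://www.sohu.com');`, with the page that prints the result declared as UTF-8. Keeping the decoding in `normalize_body()` separate from the network call makes it possible to check the conversion without hitting a live site.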