How to fetch a website's HTML with PHP

WBOY | Original: 2016-06-23 13:46:56 | 1003 views

Link:

http://detail.tmall.com/item.htm?spm=a230r.1.0.0.MlI5e4&id=40364502055&ad_id=&am_id=&cm_id=140105335569ed55e27b&pm_id=&abbucket=12

I tried to fetch the HTML of the link above with file_get_contents(), but the test failed. What is going wrong?

Replies (solutions)

file_get_contents() worked for me.

You can use the approach from the reply above, or fetch the page with curl. What matters most is what you actually need.

Look up curl in the PHP manual.

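Try file_get_contents() a few more times; if it really doesn't work, switch to curl.
Usually it comes down to spoofing the User-Agent and Referer headers, and perhaps sending a cookie as well.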
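
The User-Agent, Referer, and cookie spoofing suggested above can be sketched with file_get_contents() and a stream context. This is only a rough sketch, not something tested against Tmall: the helper names buildBrowserContextOptions() and fetchHtml(), the User-Agent string, and the Referer value are all made up for illustration, and the site may still require a valid session cookie.

```php
<?php
// Hypothetical helper: builds HTTP stream-context options that mimic a browser request.
function buildBrowserContextOptions(string $userAgent, string $referer, string $cookie = ''): array
{
    $headers = ['Referer: ' . $referer];
    if ($cookie !== '') {
        $headers[] = 'Cookie: ' . $cookie;
    }

    return [
        'http' => [
            'method'          => 'GET',
            'user_agent'      => $userAgent,                 // sent as the User-Agent header
            'header'          => implode("\r\n", $headers),  // extra request headers
            'follow_location' => 1,                          // follow redirects
            'max_redirects'   => 20,
            'timeout'         => 10,                         // seconds
        ],
    ];
}

// Hypothetical helper: returns the response body, or false on failure.
function fetchHtml(string $url, array $options)
{
    $context = stream_context_create($options);
    return file_get_contents($url, false, $context);
}

// Example call (placeholder values, not known-good ones for Tmall):
// $options = buildBrowserContextOptions(
//     'Mozilla/5.0 (compatible; ExampleFetcher/1.0)',
//     'http://www.tmall.com/'
// );
// $html = fetchHtml('http://detail.tmall.com/item.htm?id=40364502055', $options);
// if ($html === false) { echo "request failed\n"; }
```

Keeping the option-building step separate from the request makes it easy to print and inspect the headers before anything is sent.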

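The page can be fetched. You can also extract just the part you need by matching the opening and closing strings that surround it.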
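
Here is one way the delimiter-based extraction mentioned above might look. extractBetween() is a hypothetical helper name and the sample HTML is invented; on a real page you would pick markers that actually surround the data you want.

```php
<?php
// Hypothetical helper: returns the text between the first $start marker and
// the next $end marker after it, or null if either marker is missing.
function extractBetween(string $html, string $start, string $end): ?string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);              // skip past the opening marker

    $to = strpos($html, $end, $from);     // search for the closing marker after it
    if ($to === false) {
        return null;
    }

    return substr($html, $from, $to - $from);
}

// Invented sample input, standing in for a fetched page.
$sample = '<html><head><title>Example item</title></head><body></body></html>';
echo extractBetween($sample, '<title>', '</title>'), "\n"; // prints "Example item"
```

Plain string matching like this is fragile if the page layout changes. For anything more structured, PHP's DOMDocument class is probably the sturdier choice.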

<?php
$url = "http://detail.tmall.com/item.htm?spm=a230r.1.0.0.MlI5e4&id=40364502055&ad_id=&am_id=&cm_id=140105335569ed55e27b&pm_id";
$content = getcurl($url);
echo $content;

// Fetch a URL with curl and return the response body (false on failure).
function getcurl($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 20);         // but at most 20 of them
    $file_contents = curl_exec($ch);
    curl_close($ch);                                 // close the handle before returning
    return $file_contents;
}
?>

The curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); setting is the important one here: it makes curl follow Tmall's redirect pages.

Thank you very much!
