Could someone help write a single-page PHP scraper, with an example? How should this be solved?

WBOY · Original
2016-06-13 10:28:12 · 1107 views

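Could someone help write a single-page PHP scraper and include an example?
For instance, I want to scrape this page: http://news.163.com/12/0613/20/83TJ7PA700014JB6.html

Requirements:
Scrape the title
Scrape the body text

Thanks!

------Solution--------------------
First, go to http://simplehtmldom.sourceforge.net/index.htm (click "Download latest version form Sourceforge.") and download simple_html_dom.php. It is a dead-simple alternative to writing regular expressions by hand, and the official site has a detailed tutorial that is easy to follow.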

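PHP code
<?php
// Declare GB2312 for the output so the scraped Chinese text displays correctly.
header("Content-type: text/html; charset=gb2312");
require dirname(__FILE__) . '/simple_html_dom.php';

// Fetch the page with cURL, sending a browser-like User-Agent.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://news.163.com/12/0613/20/83TJ7PA700014JB6.html');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$htmls = curl_exec($ch);
curl_close($ch);
if ($htmls === false) {
    die('Failed to fetch the page.');
}

// Parse the HTML and select elements by id.
$html = str_get_html($htmls);
if (!$html) {
    die('Failed to parse the page.');
}
foreach ($html->find('#h1title') as $title) {
    echo strip_tags($title) . '<br>'; // title
}
foreach ($html->find('#endText') as $content) {
    echo strip_tags($content); // body text
}

------Solution--------------------
A method for getting the QQ Mail friend list with PHP:

------Solution--------------------
Simply fetching the page is enough. The title is whatever sits between the title tags, and the body text is whatever sits between the body tags. Use regular expressions to strip out the content you do not need.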
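
The regex approach suggested in the last reply could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names extract_title and extract_body_text are made up, the sample HTML is inlined so the script runs without network access, and PHP 7.1 or later is assumed for the nullable return types. In practice you would pass in the string returned by curl_exec instead of $sample.

```php
<?php
// Hypothetical helpers illustrating the "title tags / body tags / regex" idea.

/** Return the text between <title> and </title>, or null if there is none. */
function extract_title(string $html): ?string
{
    if (preg_match('~<title[^>]*>(.*?)</title>~is', $html, $m)) {
        return trim(strip_tags($m[1]));
    }
    return null;
}

/** Return the visible text between <body> and </body>, or null if there is none. */
function extract_body_text(string $html): ?string
{
    if (!preg_match('~<body[^>]*>(.*?)</body>~is', $html, $m)) {
        return null;
    }
    // Drop script and style blocks entirely, including their contents.
    $body = preg_replace('~<(script|style)\b[^>]*>.*?</\1>~is', '', $m[1]);
    // Replace every remaining tag with a space so adjacent blocks do not run together.
    $text = preg_replace('~<[^>]+>~', ' ', $body);
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('~\s+~', ' ', $text));
}

// Inline sample page (an assumption, standing in for the fetched HTML).
$sample = '<html><head><title> Sample headline </title></head>'
        . '<body><script>var x = 1;</script><h1>Sample headline</h1>'
        . '<p>First paragraph.</p><p>Second paragraph.</p></body></html>';

echo extract_title($sample), "\n";
echo extract_body_text($sample), "\n";
```

Note that this returns everything inside body, including navigation and footers, so on a real news page you would still need extra patterns to narrow it down to the article. That is why selecting a specific element by id, as in the simple_html_dom reply, tends to be less fragile.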