
PHP: Scraping the Content of a Specified URL

WBOY
2016-06-13 12:19:46

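Study other people's ideas and make them your own; over time you will find that you have built up approaches and methods for solving many problems on the spot.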

The code is as follows:


<?php
/*
Purpose: fetch page content and save it locally for reading; lost63
*/
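class GetUrl {
    public $url;     // address
    public $result;  // result (stream handle returned by fopen)
    public $content; // content
    public $list;    // list

    // A method named after the class is no longer treated as a constructor in PHP 8
    public function __construct($url) {
        $this->url = $url;
        $this->GetContent();
        $this->GetList();
        $this->FileSave();
        //print_r($this->list[2]);
    }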
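    private function GetContent() {
        $this->content = '';
        // Opening a URL with fopen() requires allow_url_fopen to be enabled
        $this->result = @fopen($this->url, "r");
        if ($this->result === false) {
            return; // the URL could not be opened; content stays empty
        }
        while (!feof($this->result)) {
            $this->content .= fgets($this->result, 9999);
        }
        fclose($this->result);
    }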
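    private function GetList() {
        // The original pattern lost its HTML tags during extraction. This anchor
        // pattern is an assumption, chosen so that group 2 holds the href value.
        preg_match_all('/<a(.*?)href="(.*?)"(.*?)>/i', $this->content, $this->list);
        $this->list[2] = array_unique($this->list[2]); // remove duplicate values
        foreach ($this->list[2] as $key => $value) { // each() was removed in PHP 8
            // strpos() returns false when nothing is found, so compare strictly
            if (strpos($value, ".html") === false || strpos($value, "jiaocheng") === false) {
                unset($this->list[2][$key]);
            } else {
                $this->list[2][$key] = substr($value, 0, strpos($value, ".html")) . ".html"; // strip unwanted trailing markup
            }
        }
    }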
    private function FileSave() {
        foreach ($this->list[2] as $value) {
            $this->url = $value; // reassign
            $this->content = null;
            $this->GetContent(); // fetch the content
            // The original pattern lost its HTML tags during extraction;
            // matching the <title> element is an assumption based on the comment.
            preg_match_all('/<title>(.*?)<\/title>/is', (string) $this->content, $files); // get the title
            if (empty($files[1][0])) {
                continue; // no title found, so there is no file name to use
            }
            $filename = $files[1][0] . ".html"; // file name to save as
            $content = $this->str_cut($this->content, 'http://pagead2.googlesyndication.com/pagead/show_ads.js', '<div id="article_detail">');
            $file = fopen($filename, "w");
            fwrite($file, $content);
            fclose($file);
            echo $filename . " saved OK<br>\n";
        }
    }

    // Returns the text between the first $start and the first $end that follows it
    function str_cut($str, $start, $end) {
        $content = strstr($str, $start);
        $content = substr($content, strlen($start), strpos($content, $end) - strlen($start));
        return $content;
    }
}

$w = new GetUrl("http://www.ijavascript.cn/jiaocheng/javascript-jiaocheng-352.html");
?>
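
The two pure steps in the class above, filtering the link list and cutting a fragment out of the page, can be tried without any network access. The following is only a minimal sketch under stated assumptions: the function names cut_between and filter_tutorial_links and the sample strings are hypothetical and do not appear in the original class. Unlike str_cut, this variant returns null when a marker is missing instead of relying on how substr treats a false offset.

```php
<?php
// Hypothetical helper (not part of the original class): returns the text
// between $start and the next $end, or null when either marker is missing.
function cut_between(string $str, string $start, string $end): ?string {
    $from = strpos($str, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = strpos($str, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($str, $from, $to - $from);
}

// Hypothetical helper: keeps links that contain both ".html" and "jiaocheng",
// drops everything after ".html", and removes duplicates.
function filter_tutorial_links(array $links): array {
    $kept = [];
    foreach ($links as $link) {
        $pos = strpos($link, ".html");
        // Strict comparison: a match at position 0 is still a match
        if ($pos === false || strpos($link, "jiaocheng") === false) {
            continue;
        }
        $kept[] = substr($link, 0, $pos) . ".html";
    }
    return array_values(array_unique($kept));
}

// Made-up sample data for illustration only
$page = 'header<!--begin-->article body<!--end-->footer';
echo cut_between($page, '<!--begin-->', '<!--end-->'), "\n";

$links = ['/jiaocheng/a.html" class="x', '/jiaocheng/a.html', '/about.html'];
print_r(filter_tutorial_links($links));
```

Keeping these steps as standalone functions makes the strict `=== false` checks easy to verify before wiring them back into the fetching loop.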
