The code is as follows:
<?php
// Fetch the HTML of a page with cURL
function getwebcontent($url){
    $ch = curl_init();
    $timeout = 10; // connection timeout in seconds
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); // follow redirects
    $contents = trim(curl_exec($ch));
    curl_close($ch);
    return $contents;
}
// Fetch the list page that contains the titles and URLs
$string = getwebcontent('http://www.***.com/learn/zhunbeihuaiyun/jijibeiyun/2');
// Regular match: get title and address
preg_match_all("/(.*)/", $string, $out, PREG_SET_ORDER);
foreach($out as $key => $value){
    $article['title'][] = $out[$key][2]; // capture group 2: title
    $article['link'][] = "http://www.***.com/learn/article/".$out[$key][1]; // capture group 1: article path
}
// Get the article content based on the URL
foreach($article['link'] as $key => $value){
    $content_html = getwebcontent($article['link'][$key]);
    preg_match("/[\s\S]*?
/", $content_html, $matches);
    $article['content'][$key] = $matches[0];
}
// Transcode the titles from UTF-8 to GBK; without this the files cannot be saved under those names
foreach($article['title'] as $key => $value){
    $article['title'][$key] = iconv('utf-8', 'gbk', $value); // Transcode
}
// Save each article to a file named after its title
$num = count($article['title']);
for($i = 0; $i < $num; $i++){
    file_put_contents("{$article['title'][$i]}.txt", $article['content'][$i]);
}
?>
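The save loop uses each title directly as a file name, so a title containing characters such as "/" or "?" could make file_put_contents fail. Below is a minimal sketch of one way to guard against that. The helper name safe_filename and the sample title are assumptions for illustration and are not part of the original code. It would need to run on the UTF-8 title before the iconv step, because the second byte of a GBK character can coincide with ASCII characters such as the backslash.

```php
<?php
// Hypothetical helper (not in the original listing): replace characters
// that are not allowed in file names on Windows or Unix with "_".
function safe_filename(string $title): string
{
    $forbidden = ['\\', '/', ':', '*', '?', '"', '<', '>', '|'];
    $clean = trim(str_replace($forbidden, '_', $title));
    // Fall back to a placeholder when nothing usable is left.
    return $clean === '' ? 'untitled' : $clean;
}

// Sample title invented for illustration.
$name = safe_filename('What/Why?') . '.txt';
echo $name, "\n";
```

In the listing above, this would mean calling the helper on each title inside the transcoding loop, just before iconv.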
The above presents PHP code that collects articles with regular expressions, including the article content. I hope it is helpful to readers who are interested in PHP.
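The link-matching pattern shown in the listing has a single capture group, while the loop reads groups 1 and 2, so the pattern has to be adapted to the target page's markup. As an alternative to a regular expression, the titles and links could be collected with PHP's DOM extension. The following is only a sketch under assumptions: the function name extract_links and the sample markup are invented for illustration, and it assumes the list page uses ordinary a elements with an href attribute.

```php
<?php
// Hypothetical helper (not in the original listing): return the text and
// href of every <a> element found in an HTML string.
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; collect parser warnings
    // internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $links[] = [
            'title' => trim($a->textContent),
            'link'  => $a->getAttribute('href'),
        ];
    }
    return $links;
}

// Sample markup invented for illustration.
$html = '<ul><li><a href="/learn/article/1">First</a></li>'
      . '<li><a href="/learn/article/2">Second</a></li></ul>';
$links = extract_links($html);
foreach ($links as $item) {
    echo $item['title'], ' => ', $item['link'], "\n";
}
```

One caveat with this approach: loadHTML assumes ISO-8859-1 when the page does not declare a charset, so UTF-8 pages with non-ASCII titles may need an explicit encoding declaration before parsing.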