
Scraping website content: how do I stop after a given number of lines?

WBOY
2016-06-13 12:46:36

When scraping a website's content, how can I stop reading at a given line?

function get_content_by_socket($url){
    // eregi_replace() was removed in PHP 7; preg_replace() with the i flag does the same job
    $url = preg_replace('#^http://#i', '', $url);
    $temp = explode('/', $url);
    $host = array_shift($temp);
    $url = implode('/', $temp);
    $temp = explode(':', $host);
    $host = $temp[0];
    $port = isset($temp[1]) ? (int)$temp[1] : 80;
    //echo $url;
    //echo $host;
    // connect to the parsed port rather than a hard-coded 80
    $fp = fsockopen($host, $port) or die("Open ". $url ." failed");
    $header = "GET /".$url ." HTTP/1.1\r\n";
    $header .= "Accept: */*\r\n";
    $header .= "Accept-Language: zh-cn\r\n";
    // "Accept-Encoding: gzip, deflate" is dropped: nothing here decompresses the response
    $header .= "If-Modified-Since: Tue, 06 Apr 2010 07:56:03 GMT; length=2235\r\n";
    $header .= "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; GTB6.4)\r\n";
    $header .= "Host: ". $host ."\r\n";
    $header .= "Referer: http://video.baidu.com/v?word=11&ct=301989888&rn=20&pn=0&db=0&s=0&fbl=800\r\n";
    //fputs($content, "Referer: $domainrn");// spoofed part
    $header .= "Cookie: BAIDUID=5F96971273579588527A980F307E8B7A:FG=1\r\n";
    // Connection: Close makes the server end the stream, so feof() becomes true without waiting for a keep-alive timeout
    $header .= "Connection: Close\r\n\r\n";

    fwrite($fp, $header);
    $contents = ''; // initialise before appending to avoid an undefined-variable notice
    while (!feof($fp)) {
        $contents .= fgets($fp, 8192);
    }
    fclose($fp);
    return $contents;
}


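Take this function as an example.

I want it to read only up to line 10, skip everything after that, and return what it has so far. That way I get just the part I need and save time and resources.
Alternatively, it could read only up to

 some field that I define myself.
Is this possible?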
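
One possible approach, as a minimal sketch rather than a definitive answer: count the lines inside the read loop and break out as soon as the limit is reached or a chosen marker string shows up. The helper name `read_limited` and its parameters `$maxLines` and `$marker` are made up for this illustration; it works on any open stream, including the socket returned by `fsockopen()`.

```php
<?php
/**
 * Read from an open stream until $maxLines lines have been read or a line
 * containing $marker has been read, whichever happens first.
 * $maxLines = 0 means "no line limit"; $marker = '' means "no marker".
 * (Hypothetical helper, not part of the original function.)
 */
function read_limited($fp, $maxLines = 0, $marker = '')
{
    $contents = '';
    $count = 0;
    while (!feof($fp)) {
        // fgets() returns at most 8191 bytes per call, so a longer line
        // is seen, and counted, as several pieces.
        $line = fgets($fp, 8192);
        if ($line === false) {
            break; // nothing more to read
        }
        $contents .= $line;
        $count++;
        if ($maxLines > 0 && $count >= $maxLines) {
            break; // line limit reached
        }
        if ($marker !== '' && strpos($line, $marker) !== false) {
            break; // marker found; the line containing it is kept
        }
    }
    return $contents;
}
```

In `get_content_by_socket()`, the `while (!feof($fp))` loop could then be replaced with something like `$contents = read_limited($fp, 10);` or `$contents = read_limited($fp, 0, '</title>');` (the marker here is only an example), followed by `fclose($fp)`. Closing the socket right after the early exit is what saves time, since the rest of the response is never read. Two caveats: the HTTP status line and response headers are counted as lines too, and the request must not send `Accept-Encoding: gzip, deflate`, because a compressed body cannot be split into meaningful lines.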