Several ways to fetch web page content in PHP
Method 1: Use file_get_contents to fetch the content with a GET request.
<?php
$url  = 'http://www.domain.com/?para=123';
$html = file_get_contents($url);
echo $html;
?>
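file_get_contents returns false when the request fails, so it is worth checking the result before echoing it. The following is a minimal sketch, not part of the original article: the helper name fetch_or_null and the 5-second default timeout are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper: fetch a URL with a timeout and return null
// (instead of false plus a warning) when the request fails.
function fetch_or_null(string $url, float $timeoutSeconds = 5.0): ?string
{
    // The 'timeout' HTTP context option limits how long the read may take.
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'timeout' => $timeoutSeconds],
    ]);
    // @ suppresses the warning; the false return value is handled below.
    $html = @file_get_contents($url, false, $context);
    return $html === false ? null : $html;
}
```

It could then be called as $html = fetch_or_null('http://www.domain.com/?para=123'); followed by a check for null before the content is used.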
Method 2: Use file_get_contents to send a POST request to the url.
<?php
$url  = 'http://www.domain.com/test.php?id=123';
$data = array('foo' => 'bar');
$data = http_build_query($data);
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data
    )
);
$ctx  = stream_context_create($opts);
// The second argument is $use_include_path and should be a boolean, not ''.
$html = @file_get_contents($url, false, $ctx);
?>
If you also need to send cookie data, change
'header' => "Content-type: application/x-www-form-urlencoded\r\n" . "Content-Length: " . strlen($data) . "\r\n",
to
'header' => "Content-type: application/x-www-form-urlencoded\r\n" . "Content-Length: " . strlen($data) . "\r\n" . "Cookie: cookie1=c1; cookie2=c2\r\n",
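Building the header string by hand gets error-prone once cookies are involved. As a rough sketch, the options array could be assembled by a small function; build_post_options is a hypothetical name, and the cookie names below are simply the ones used in this article.

```php
<?php
// Hypothetical helper: build the stream-context options for a form POST,
// turning an associative array of cookies into a single Cookie header.
function build_post_options(array $fields, array $cookies = []): array
{
    $body    = http_build_query($fields);
    $headers = [
        'Content-Type: application/x-www-form-urlencoded',
        'Content-Length: ' . strlen($body),
    ];
    if ($cookies !== []) {
        $pairs = [];
        foreach ($cookies as $name => $value) {
            $pairs[] = $name . '=' . rawurlencode((string) $value);
        }
        // Cookie pairs are separated by "; " inside one header line.
        $headers[] = 'Cookie: ' . implode('; ', $pairs);
    }
    return [
        'http' => [
            'method'  => 'POST',
            'header'  => implode("\r\n", $headers) . "\r\n",
            'content' => $body,
        ],
    ];
}

$opts = build_post_options(['foo' => 'bar'], ['cookie1' => 'c1', 'cookie2' => 'c2']);
echo $opts['http']['header'];
```

The result can be passed directly to stream_context_create and then to file_get_contents or fopen.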
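Method 3: Use fopen to open the url and read the content with a GET request.
<?php
$url    = 'http://www.domain.com/?para=123'; // example URL, as in Method 1
$fp     = fopen($url, 'r');
$meta   = stream_get_meta_data($fp); // get the header information
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
fclose($fp);
// For http:// streams the response header lines are in $meta['wrapper_data'].
echo "url header: " . implode("\n", $meta['wrapper_data']) . " <br>";
echo "url body: $result";
?>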
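For an http:// stream, the array returned by stream_get_meta_data holds the raw response header lines under the wrapper_data key, so it cannot be printed as a plain string. A minimal sketch of turning those lines into a status code and a header map follows; parse_header_lines is a hypothetical name, and the sample lines are made up so the example runs without a network connection.

```php
<?php
// Hypothetical helper: turn raw header lines (as found in
// $meta['wrapper_data']) into a status code and a lowercase-keyed map.
function parse_header_lines(array $lines): array
{
    $status  = null;
    $headers = [];
    foreach ($lines as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            // A new status line starts a new response (e.g. after a redirect),
            // so only the headers of the last response are kept.
            $status  = (int) $m[1];
            $headers = [];
            continue;
        }
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return ['status' => $status, 'headers' => $headers];
}

// Sample lines shaped like wrapper_data for an http:// stream.
$sample = ['HTTP/1.1 200 OK', 'Content-Type: text/html; charset=UTF-8', 'Connection: close'];
$parsed = parse_header_lines($sample);
echo $parsed['status'], ' ', $parsed['headers']['content-type'], "\n";
// prints "200 text/html; charset=UTF-8"
```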
Method 4: Use fopen to open the url and read the content with a POST request.
<?php
$data = array('foo2' => 'bar2', 'foo3' => 'bar3');
$data = http_build_query($data);
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-type: application/x-www-form-urlencoded\r\nCookie: cook1=c3; cook2=c4\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data
    )
);
$context = stream_context_create($opts);
$html    = fopen('http://www.test.com/zzzz.php?id=i3&id2=i4', 'rb', false, $context);
$w       = fread($html, 1024); // reads at most the first 1024 bytes of the response
fclose($html);
echo $w;
?>
Method 5: Use the fsockopen function to open the url and fetch the complete response with a GET request, including header and body.
<?php
function get_url($url, $cookie = false)
{
    $url   = parse_url($url);
    $path  = isset($url['path']) ? $url['path'] : '/';
    $query = isset($url['query']) ? $path . '?' . $url['query'] : $path;
    echo "Query:" . $query;
    $fp = fsockopen($url['host'], isset($url['port']) ? $url['port'] : 80, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    }
    $request  = "GET $query HTTP/1.1\r\n";
    $request .= "Host: {$url['host']}\r\n";
    $request .= "Connection: Close\r\n";
    if ($cookie) {
        $request .= "Cookie: $cookie\r\n";
    }
    $request .= "\r\n";
    fwrite($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}

// Get the HTML part of the url, with the header removed
function GetUrlHTML($url, $cookie = false)
{
    $rowdata = get_url($url, $cookie);
    if ($rowdata) {
        $body = stristr($rowdata, "\r\n\r\n");
        $body = substr($body, 4);
        return $body;
    }
    return false;
}
?>
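One caveat when cutting the header off a raw HTTP/1.1 response: the server may answer with Transfer-Encoding: chunked, in which case the body still contains chunk-size lines after the split. The sketch below shows one possible way to handle that; split_http_response and decode_chunked are hypothetical names, and the decoder is deliberately simplified (it ignores trailers).

```php
<?php
// Hypothetical helper: decode a chunked body ("<hex size>\r\n<data>\r\n" ... "0\r\n\r\n").
function decode_chunked(string $body): string
{
    $decoded = '';
    $offset  = 0;
    $length  = strlen($body);
    while ($offset < $length) {
        $lineEnd = strpos($body, "\r\n", $offset);
        if ($lineEnd === false) {
            break;
        }
        // Drop optional chunk extensions after ';' and read the hex size.
        $sizeLine = substr($body, $offset, $lineEnd - $offset);
        $size     = (int) hexdec(trim(explode(';', $sizeLine, 2)[0]));
        if ($size === 0) {
            break; // last chunk
        }
        $decoded .= substr($body, $lineEnd + 2, $size);
        $offset   = $lineEnd + 2 + $size + 2; // skip the data and its trailing CRLF
    }
    return $decoded;
}

// Hypothetical helper: split a raw response into header block and body.
function split_http_response(string $raw): array
{
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        return ['head' => $raw, 'body' => ''];
    }
    $head = substr($raw, 0, $pos);
    $body = substr($raw, $pos + 4);
    if (preg_match('/^Transfer-Encoding:\s*chunked/mi', $head)) {
        $body = decode_chunked($body);
    }
    return ['head' => $head, 'body' => $body];
}

// Made-up sample response, so the example runs without a network connection.
$raw = "HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n";
echo split_http_response($raw)['body'], "\n";
// prints "hello world"
```

Sending the request as HTTP/1.0 is another way to avoid chunked responses, since chunked transfer encoding is an HTTP/1.1 feature.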
Method 6: Use the fsockopen function to open the url and fetch the complete response with a POST request, including header and body.
<?php
function HTTP_Post($URL, $data, $cookie, $referrer = "")
{
    // parsing the given URL
    $URL_Info = parse_url($URL);
    // Building referrer
    if ($referrer == "") { // if not given, fall back to a placeholder value
        $referrer = "111";
    }
    // making string from $data
    $values = array();
    foreach ($data as $key => $value) {
        $values[] = "$key=" . urlencode($value);
    }
    $data_string = implode("&", $values);
    // Find out which port is needed - if not given use standard (=80)
    if (!isset($URL_Info["port"])) {
        $URL_Info["port"] = 80;
    }
    // building POST-request (HTTP header lines end in \r\n):
    $request  = "POST " . $URL_Info["path"] . " HTTP/1.1\r\n";
    $request .= "Host: " . $URL_Info["host"] . "\r\n";
    $request .= "Referer: $referrer\r\n";
    $request .= "Content-type: application/x-www-form-urlencoded\r\n";
    $request .= "Content-length: " . strlen($data_string) . "\r\n";
    $request .= "Connection: close\r\n";
    $request .= "Cookie: $cookie\r\n";
    $request .= "\r\n";
    $request .= $data_string;
    $fp = fsockopen($URL_Info["host"], $URL_Info["port"]);
    fputs($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}
?>
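Because the request is written to the socket as raw text, it helps to build that text separately and inspect it before sending. A possible sketch is shown below; build_post_request is a hypothetical name, and it uses http_build_query for the body, which is a swap for the manual urlencode loop rather than the article's own approach.

```php
<?php
// Hypothetical helper: build the raw text of a form POST request.
// Header lines end in CRLF and an empty line separates header and body.
function build_post_request(string $url, array $fields, string $cookie = ''): string
{
    $parts = parse_url($url);
    $path  = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $path .= '?' . $parts['query']; // keep the query string in the request line
    }
    $body  = http_build_query($fields);
    $lines = [
        "POST $path HTTP/1.1",
        'Host: ' . $parts['host'],
        'Content-Type: application/x-www-form-urlencoded',
        'Content-Length: ' . strlen($body),
        'Connection: close',
    ];
    if ($cookie !== '') {
        $lines[] = 'Cookie: ' . $cookie;
    }
    return implode("\r\n", $lines) . "\r\n\r\n" . $body;
}

echo build_post_request(
    'http://www.test.com/zzzz.php?id=i3',
    ['foo2' => 'bar2', 'foo3' => 'bar3'],
    'cook1=c3; cook2=c4'
);
```

The returned string can be written to the socket with fwrite exactly as in the function above.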
Method 7: Use the curl library. Before using it, check that the curl extension is enabled in php.ini.
<?php
$ch      = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, 'http://www.domain.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
echo $file_contents;
?>
The following are three common ways to fetch a web page's source code and crawl its content with PHP; choose among them according to your actual needs.
1. Use file_get_contents to obtain the web page source code
This method is the most commonly used and only requires two lines of code. It is very simple and convenient.
Reference code:
<?php $fh= file_get_contents('http://www.webkaka.com/'); echo $fh; ?>
2. Use fopen to obtain the web page source code
This method is also widely used, but it takes more code.
Reference code:
<?php
$fh = fopen('http://www.webkaka.com/', 'r');
if ($fh) {
    while (!feof($fh)) {
        echo fgets($fh);
    }
    fclose($fh);
}
?>
3. Use curl to obtain the source code of the web page
curl is typically chosen when the requirements are higher, for example when crawling pages and you also need the response header information, control over content encoding (CURLOPT_ENCODING), or a custom user agent (CURLOPT_USERAGENT).
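Reference code one:
<?php
// Create a new cURL resource
$ch = curl_init();
// Set the URL and other options
curl_setopt($ch, CURLOPT_URL, "http://www.webkaka.com/");
curl_setopt($ch, CURLOPT_HEADER, false);
// Return the page as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Fetch the URL and pass it to the browser
$data = curl_exec($ch);
echo $data;
// Close the cURL resource and free up system resources
curl_close($ch);
?>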
Reference code two:
<?php
$szUrl     = "http://www.webkaka.com/";
$UserAgent = 'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.0.04506; .NET CLR 3.5.21022; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $szUrl);
curl_setopt($curl, CURLOPT_HEADER, 0); // 0 = do not output the header, 1 = output it
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Disabling certificate checks is insecure; avoid it outside of testing.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_ENCODING, '');
curl_setopt($curl, CURLOPT_USERAGENT, $UserAgent);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($curl);
echo $data;
// echo curl_errno($curl); // returns 0 when the request succeeded;
// curl_error($curl) returns the matching error message
curl_close($curl);
?>
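Since curl_exec returns false on failure, the error number and message can be collected together with the body. The following is a minimal sketch under stated assumptions: curl_fetch and the shape of its result array are hypothetical, and the 5-second timeouts are arbitrary. It requires the curl extension.

```php
<?php
// Hypothetical helper: fetch a URL with curl and report errors
// instead of silently echoing an empty result.
function curl_fetch(string $url, int $timeoutSeconds = 5): array
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body as a string
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ]);
    $body = curl_exec($ch);
    // curl_errno() is 0 on success; curl_error() holds the matching message.
    // The handle is released when $ch goes out of scope.
    return [
        'ok'     => $body !== false,
        'body'   => $body === false ? null : $body,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'errno'  => curl_errno($ch),
        'error'  => curl_error($ch),
    ];
}
```

A caller might write $r = curl_fetch('http://www.webkaka.com/'); and then echo $r['body'] when $r['ok'] is true, or log $r['error'] otherwise.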