Sample code 1: Use file_get_contents to fetch content via GET
The code is as follows:
<?php
// ec(), printhr() and printarr() are the author's output helpers
// (printarr() is defined later in this article)
$url = 'http://www.baidu.com/';
$html = file_get_contents($url);
//print_r($http_response_header);
ec($html);
printhr();
printarr($http_response_header);
printhr();
?>
Sample code 2: Open the url with fopen and fetch the content via GET
The code is as follows:
<?php
$url = 'http://www.baidu.com/';
$fp = fopen($url, 'r');
printarr(stream_get_meta_data($fp));
printhr();
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
echo "url body: $result";
printhr();
fclose($fp);
?>
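stream_get_meta_data works on any stream handle, not only remote ones, so its output can be checked without a network connection. A minimal sketch using the built-in php://memory stream:

```php
<?php
// open an in-memory stream and inspect its metadata
$fp = fopen('php://memory', 'r+');
$meta = stream_get_meta_data($fp);
echo $meta['wrapper_type'];   // the php:// wrappers report "PHP" here
fclose($fp);
```

For a remote fopen handle, the same array also carries the response headers under the wrapper_data key.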
Sample code 3: Use the file_get_contents function to fetch a url via POST
The code is as follows:
<?php
$data = array('foo' => 'bar');
$data = http_build_query($data);
$opts = array(
    'http' => array(
        'method'  => 'POST',
        'header'  => "Content-type: application/x-www-form-urlencoded\r\n" .
                     "Content-Length: " . strlen($data) . "\r\n",
        'content' => $data
    ),
);
$context = stream_context_create($opts);
$html = file_get_contents('http://localhost/e/admin/test.html', false, $context);
echo $html;
?>
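Sample 3 relies on http_build_query to build the POST body. What it produces can be checked offline on a simple array (the extra 'baz' key is only for illustration):

```php
<?php
// http_build_query urlencodes an array into a query string joined by "&"
$data = array('foo' => 'bar', 'baz' => 'a b');
echo http_build_query($data);   // foo=bar&baz=a+b
```

Note the space is encoded as "+" (RFC 1738 style), which is what the application/x-www-form-urlencoded content type expects.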
Sample code 4: Use the fsockopen function to open a url and fetch the complete data via GET, including header and body
The code is as follows:
<?php
function get_url($url, $cookie = false) {
    $url = parse_url($url);
    $query = $url['path'] . "?" . $url['query'];
    ec("Query: " . $query);
    $fp = fsockopen($url['host'], isset($url['port']) ? $url['port'] : 80, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    } else {
        $request = "GET $query HTTP/1.1\r\n";
        $request .= "Host: " . $url['host'] . "\r\n";
        $request .= "Connection: Close\r\n";
        if ($cookie) $request .= "Cookie: $cookie\r\n";
        $request .= "\r\n";
        fwrite($fp, $request);
        $result = '';
        while (!@feof($fp)) {
            $result .= @fgets($fp, 1024);
        }
        fclose($fp);
        return $result;
    }
}
//Get the html body of the url, with the header removed
function GetUrlHTML($url, $cookie = false) {
    $rowdata = get_url($url, $cookie);
    if ($rowdata) {
        $body = stristr($rowdata, "\r\n\r\n");
        $body = substr($body, 4, strlen($body));
        return $body;
    }
    return false;
}
?>
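get_url above builds its request from the pieces returned by parse_url. A small offline sketch of those pieces, using a made-up URL for illustration:

```php
<?php
// parse_url splits a URL into host, port, path, query, etc.
$parts = parse_url('http://www.example.com:8080/path/page.php?id=1');
echo $parts['host'] . "\n";    // www.example.com
echo $parts['port'] . "\n";    // 8080
echo $parts['path'] . "\n";    // /path/page.php
echo $parts['query'] . "\n";   // id=1
```

When the URL has no explicit port, the 'port' key is absent, which is why get_url falls back to 80.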
Sample code 5: Use the fsockopen function to open a url and fetch the complete data via POST, including header and body
The code is as follows:
<?php
function HTTP_Post($URL, $data, $cookie, $referrer = "") {
    // parse the given URL
    $URL_Info = parse_url($URL);
    // build the referrer
    if ($referrer == "") // if not given, use a placeholder as referrer
        $referrer = "111";
    // build the query string from $data
    $values = array();
    foreach ($data as $key => $value)
        $values[] = "$key=" . urlencode($value);
    $data_string = implode("&", $values);
    // find out which port is needed - if not given use standard (=80)
    if (!isset($URL_Info["port"]))
        $URL_Info["port"] = 80;
    // build the POST request:
    $request  = "POST " . $URL_Info["path"] . " HTTP/1.1\r\n";
    $request .= "Host: " . $URL_Info["host"] . "\r\n";
    $request .= "Referer: $referrer\r\n";
    $request .= "Content-type: application/x-www-form-urlencoded\r\n";
    $request .= "Content-length: " . strlen($data_string) . "\r\n";
    $request .= "Connection: close\r\n";
    $request .= "Cookie: $cookie\r\n";
    $request .= "\r\n";
    $request .= $data_string . "\r\n";
    $fp = fsockopen($URL_Info["host"], $URL_Info["port"]);
    fputs($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}
printhr();
?>
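GetUrlHTML in sample 4 strips the response header by locating the blank \r\n\r\n line that separates header from body. The same split can be demonstrated offline on a hand-written response string:

```php
<?php
// an HTTP response separates header from body with an empty line (\r\n\r\n)
$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>hello</html>";
$pos = strpos($raw, "\r\n\r\n");
$body = substr($raw, $pos + 4);   // skip the 4 separator bytes
echo $body;   // <html>hello</html>
```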
Sample code 6: Use the curl library. Before using it, you may need to check php.ini to see whether the curl extension has been enabled.
The code is as follows:
<?php
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, 'http://www.baidu.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$file_contents = curl_exec($ch);
curl_close($ch);
echo $file_contents;
?>
About the curl library:
curl official website: http://curl.haxx.se/
curl is a tool for transferring files with URL syntax, supporting FTP, FTPS, HTTP, HTTPS, SCP, SFTP, TFTP, TELNET, DICT, FILE and LDAP. curl supports SSL certificates, HTTP POST, HTTP PUT, FTP upload, Kerberos, HTTP-based upload, proxies, cookies, user+password authentication, file transfer resume, HTTP proxy tunneling and many other useful tricks.
The helper function printarr() used in the examples above is defined as follows:
<?php
function printarr(array $arr)
{
    echo "<br>";
    foreach ($arr as $key => $value) {
        echo "$key=$value <br>";
    }
}
?>
======================================================
Code for PHP to capture remote website data
Many programming enthusiasts run into the same question: how do you crawl the HTML code of other people's websites the way a search engine does, and then collect and organize that code into data that is useful to you? Here are some simple examples.
Ⅰ. Example of grabbing the title of a remote web page:
The code is as follows:
<?php
/*
+-------------------------------------------------------------
+ Code to grab a web page title; save this snippet as a .php file and run it.
+-------------------------------------------------------------
*/
error_reporting(7);
$file = fopen("http://www.jb51.net/", "r");
if (!$file) {
    echo "Unable to open remote file.\n";
    exit;
}
while (!feof($file)) {
    $line = fgets($file, 1024);
    if (eregi("<title>(.*)</title>", $line, $out)) {
        $title = $out[1];
        echo $title;
        break;
    }
}
fclose($file);
//End
?>
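Note that eregi was deprecated in PHP 5.3 and removed in PHP 7, so on a modern PHP the same title match would use preg_match with the /i flag. A minimal offline sketch on a sample line:

```php
<?php
// preg_match with the /i flag replaces the case-insensitive eregi call
$line = "<head><title>Example Page</title></head>";
if (preg_match('/<title>(.*)<\/title>/i', $line, $out)) {
    echo $out[1];   // Example Page
}
```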
Ⅱ. Example of grabbing the HTML code of a remote web page:
The code is as follows:
<?php
/*
+----------------
+ DNSing Sprider
+----------------
*/
$fp = fsockopen("www.dnsing.com", 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)\n";
} else {
    $out = "GET / HTTP/1.1\r\n";
    $out .= "Host: www.dnsing.com\r\n";
    $out .= "Connection: Close\r\n\r\n";
    fputs($fp, $out);
    while (!feof($fp)) {
        echo fgets($fp, 128);
    }
    fclose($fp);
}
//End
?>
Copy the two snippets above and run them to see the effect. These examples are only a prototype of grabbing web page data; to make them fit your own use you will need to adapt them to your situation, so please experiment for yourself.
==================================
The somewhat useful functions here are: get_content_by_socket(), get_url(), get_content_url() and get_content_object(). They may give you some ideas.
<?php
//Get all content urls and save them to a file
function get_index($save_file, $prefix = "index_") {
    $count = 68;
    $i = 1;
    if (file_exists($save_file)) @unlink($save_file);
    $fp = fopen($save_file, "a+") or die("Open " . $save_file . " failed");
    while ($i < $count) {
        $url = $prefix . $i . ".htm";
        echo "Get " . $url . "...";
        $url_str = get_content_url(get_url($url));
        echo " OK\n";
        fwrite($fp, $url_str);
        ++$i;
    }
    fclose($fp);
}
//Get the target multimedia objects
function get_object($url_file, $save_file, $split = "|--:**:--|") {
    if (!file_exists($url_file)) die($url_file . " not exist");
    $file_arr = file($url_file);
    if (!is_array($file_arr) || empty($file_arr)) die($url_file . " no content");
    $url_arr = array_unique($file_arr);
    if (file_exists($save_file)) @unlink($save_file);
    $fp = fopen($save_file, "a+") or die("Open save file " . $save_file . " failed");
    foreach ($url_arr as $url) {
        if (empty($url)) continue;
        echo "Get " . $url . "...";
        $html_str = get_url($url);
        //debug output left in the original; the exit would stop after the first url
        //echo $html_str;
        //echo $url;
        //exit;
        $obj_str = get_content_object($html_str);
        echo " OK\n";
        fwrite($fp, $obj_str);
    }
    fclose($fp);
}
//Traverse a directory and collect from each file's content
function get_dir($save_file, $dir) {
    $dp = opendir($dir);
    if (file_exists($save_file)) @unlink($save_file);
    $fp = fopen($save_file, "a+") or die("Open save file " . $save_file . " failed");
    while (($file = readdir($dp)) !== false) {
        if ($file != "." && $file != "..") {
            echo "Read file " . $file . "...";
            $file_content = file_get_contents($dir . $file);
            $obj_str = get_content_object($file_content);
            echo " OK\n";
            fwrite($fp, $obj_str);
        }
    }
    fclose($fp);
}
//Get the content of the specified url
function get_url($url) {
    $reg = '/^http:\/\/[^\/].+$/';
    if (!preg_match($reg, $url)) die($url . " invalid");
    $fp = fopen($url, "r") or die("Open url: " . $url . " failed.");
    $content = '';
    while ($fc = fread($fp, 8192)) {
        $content .= $fc;
    }
    fclose($fp);
    if (empty($content)) {
        die("Get url: " . $url . " content failed.");
    }
    return $content;
}
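The fread loop in get_url drains the stream in 8 KB chunks until it is exhausted. The same pattern can be exercised offline against a php://memory stream standing in for the remote handle:

```php
<?php
// fill an in-memory stream, then drain it with the same 8192-byte fread loop
$fp = fopen('php://memory', 'r+');
fwrite($fp, str_repeat('x', 20000));
rewind($fp);
$content = '';
while ($fc = fread($fp, 8192)) {
    $content .= $fc;
}
fclose($fp);
echo strlen($content);   // 20000
```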
//Use a socket to get the specified web page
function get_content_by_socket($url, $host) {
    $fp = fsockopen($host, 80) or die("Open " . $url . " failed");
    $header = "GET /" . $url . " HTTP/1.1\r\n";
    $header .= "Accept: */*\r\n";
    $header .= "Accept-Language: zh-cn\r\n";
    $header .= "Accept-Encoding: gzip, deflate\r\n";
    $header .= "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Maxthon; InfoPath.1; .NET CLR 2.0.50727)\r\n";
    $header .= "Host: " . $host . "\r\n";
    //$header .= "Connection: Keep-Alive\r\n";
    //$header .= "Cookie: cnzz02=2; rtime=1; ltime=1148456424859; cnzz_eid=56601755-\r\n";
    $header .= "Connection: Close\r\n\r\n";
    fwrite($fp, $header);
    $contents = '';
    while (!feof($fp)) {
        $contents .= fgets($fp, 8192);
    }
    fclose($fp);
    return $contents;
}
//Extract the urls from the given content
function get_content_url($host_url, $file_contents) {
    //$reg = '/^(#|javascript.*?|ftp:\/\/.+|http:\/\/.+|.*?href.*?|play.*?|index.*?|.*?asp)+$/i';
    //$reg = '/^(down.*?\.html|\d+_\d+\.htm.*?)$/i';
    $rex = "/([hH][rR][eE][Ff])\s*=\s*['\"]*([^>'\"\s]+)[\"'>]*\s*/i";
    $reg = '/^(down.*?\.html)$/i';
    preg_match_all($rex, $file_contents, $r);
    $result = ""; //array();
    foreach ($r as $c) {
        if (is_array($c)) {
            foreach ($c as $d) {
                if (preg_match($reg, $d)) { $result .= $host_url . $d . "\n"; }
            }
        }
    }
    return $result;
}
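The href-matching regex used by get_content_url can be checked offline against a small hand-written HTML fragment (the file names are made up):

```php
<?php
// capture group 2 of the pattern holds the href target, quoted or not
$html = '<a href="down1.html">One</a> <a href=down2.html>Two</a>';
$rex = "/([hH][rR][eE][Ff])\s*=\s*['\"]*([^>'\"\s]+)[\"'>]*\s*/i";
preg_match_all($rex, $html, $r);
echo implode(",", $r[2]);   // down1.html,down2.html
```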
//Extract the multimedia files from the given content
function get_content_object($str, $split = "|--:**:--|") {
    $regx = "/href\s*=\s*['\"]*([^>'\"\s]+)[\"'>]*\s*(.*?)<\/a>/i";
    preg_match_all($regx, $str, $result);
    if (count($result) == 3) {
        $result[2] = str_replace("Multimedia: ", "", $result[2]);
        $result = $result[1][0] . $split . $result[2][0] . "\n";
    }
    return $result;
}
?>
======================================================
When the same domain name corresponds to multiple IPs: PHP functions for fetching remote web page content
fgc (file_get_contents) simply reads the url and encapsulates the whole operation.
fopen is also somewhat encapsulated, but you need to loop to read all the data yourself.
fsockopen is a raw socket operation.
If you just need to read an html page, file_get_contents is the better choice.
If the company accesses the Internet through a firewall, plain file_get_contents will generally not work. You can write the http request to the proxy directly through socket operations, but it is more troublesome.
If you can confirm that the file is small, either of the two methods above will do, for example fopen or join('', file($file)). For files smaller than 1k, file_get_contents is the most convenient.
If the file may be large, or its size cannot be determined, it is best to use file streams: fopening a 1K file costs no more than fopening a 1G file, the longer content simply takes longer to read, rather than killing the script.
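The join('', file($file)) idiom mentioned above reads a file as an array of lines and glues it back together; for small files it is equivalent to file_get_contents. A self-contained check on a temporary file:

```php
<?php
// write a small temp file, then read it back both ways and compare
$tmp = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($tmp, "hello\nworld");
echo join('', file($tmp)), "\n";
echo (join('', file($tmp)) === file_get_contents($tmp)) ? "same" : "different";   // same
unlink($tmp);
```

file() keeps each line's trailing newline by default, which is why the concatenation matches byte for byte.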
----------------------------------------------------------
http://www.phpcake.cn/archives/tag/fsockopen
PHP has many ways to fetch remote web content, for example with built-in functions like file_get_contents and fopen:
<?php
echo file_get_contents("http://img.jb51.net/abc.php");
?>
However, under load balancing such as DNS round-robin, the same domain name may correspond to multiple servers and multiple IPs. Assume img.jb51.net is resolved by DNS to three IPs: 72.249.146.213, 72.249.146.214 and 72.249.146.215. Each time a user accesses img.jb51.net, the system routes the request to one of the servers according to the load balancing algorithm.
When I was working on a video project last week, I ran into this requirement: I needed to access a PHP interface program (call it abc.php) on each server in order to query that server's transfer status.
In this case you cannot simply point file_get_contents at http://img.jb51.net/abc.php, because it may keep hitting the same server repeatedly.
Visiting http://72.249.146.213/abc.php, http://72.249.146.214/abc.php and http://72.249.146.215/abc.php in turn does not work either, when the Web Server on the three machines hosts multiple virtual hosts.
Setting the local hosts file does not work either, because hosts cannot map the same domain name to multiple IPs.
So it can only be done through PHP and the HTTP protocol directly: when accessing abc.php, send the img.jb51.net domain name in the Host header. So I wrote the following PHP function:
The code is as follows:
<?php
/************************
* Function: when the same domain name corresponds to multiple IPs, fetch the remote web page content from a specified server
* Created: 2008-12-09
* Author: Zhang Yan (img.jb51.net)
* Parameters:
*   $ip   The IP address of the server
*   $host The host name of the server
*   $url  The URL path on the server (excluding the domain name)
* Returns:
*   The fetched remote web page content, or
*   false if the remote web page could not be accessed
************************/
function HttpVisit($ip, $host, $url)
{
    $errstr = '';
    $errno = '';
    $fp = fsockopen($ip, 80, $errno, $errstr, 90);
    if (!$fp)
    {
        return false;
    }
    else
    {
        $out = "GET {$url} HTTP/1.1\r\n";
        $out .= "Host: {$host}\r\n";
        $out .= "Connection: close\r\n\r\n";
        fputs($fp, $out);
        $response = '';
        while ($line = fread($fp, 4096)) {
            $response .= $line;
        }
        fclose($fp);
        //Remove the header information
        $pos = strpos($response, "\r\n\r\n");
        $response = substr($response, $pos + 4);
        return $response;
    }
}
//Calling method:
$server_info1 = HttpVisit("72.249.146.213", "img.jb51.net", "/abc.php");
$server_info2 = HttpVisit("72.249.146.214", "img.jb51.net", "/abc.php");
$server_info3 = HttpVisit("72.249.146.215", "img.jb51.net", "/abc.php");
?>
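The key trick in HttpVisit is that the TCP connection goes to the raw IP while the Host header carries the domain name, so the right virtual host answers. The exact request bytes it sends can be previewed offline:

```php
<?php
// the request HttpVisit writes to the socket for /abc.php on img.jb51.net
$host = 'img.jb51.net';
$url = '/abc.php';
$out  = "GET {$url} HTTP/1.1\r\n";
$out .= "Host: {$host}\r\n";
$out .= "Connection: close\r\n\r\n";
echo $out;
```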