
Usage analysis of get_meta_tags(), cURL and user-agent in PHP

WBOY (original), 2016-07-13 10:11:43, 955 views


This article walks through the usage of get_meta_tags(), cURL and the User-Agent header in PHP with examples, for your reference. The analysis is as follows:

The get_meta_tags() function grabs tags of the form <meta name="..." content="..."> from a page and loads them into a one-dimensional array, where name becomes the element key (converted to lowercase) and content becomes the element value. For example, <meta name="A" content="1"> and <meta name="b" content="2"> yield the array array('a'=>'1', 'b'=>'2'). Other <meta> tags (those without a name attribute) are not processed. The function also stops parsing at the </head> tag: any <meta> after it is ignored, although a <meta> that appears before <head> is still processed.
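The behavior described above can be checked without any network access. The following is a minimal sketch, assuming a throwaway HTML file written to the system temp directory; the function name demo_get_meta_tags() and the tag values are made up for illustration.

```php
<?php
// Hypothetical helper: writes a small HTML page to a temp file and parses it.
function demo_get_meta_tags(): array
{
    $html = <<<'HTML'
<meta name="early" content="before head">
<html>
<head>
<meta name="Author" content="someone">
<meta name="keywords" content="php, curl">
<meta http-equiv="refresh" content="30">
</head>
<body>
<meta name="late" content="after head">
</body>
</html>
HTML;

    // get_meta_tags() accepts a local path as well as a URL.
    $file = tempnam(sys_get_temp_dir(), 'meta');
    file_put_contents($file, $html);
    $tags = get_meta_tags($file);
    unlink($file);

    return $tags;
}

print_r(demo_get_meta_tags());
```

The result should contain the keys early, author and keywords (note the lowercase author), while the http-equiv tag and the tag placed after </head> should be absent.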

User-agent is part of the header information that the browser submits, invisibly to the user, when it requests a web page from the server. The headers carry several pieces of information, such as cache-related fields and cookies; user-agent is the browser type declaration, identifying the client as IE, Chrome, Firefox, and so on.
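To make the link between the raw header and what PHP sees more concrete, here is a small sketch. It relies on the CGI convention that most request headers appear in $_SERVER with an HTTP_ prefix, uppercased, with hyphens turned into underscores. The helper names and the sample request are hypothetical, and the parser only imitates what the web server does for a real request.

```php
<?php
// Hypothetical helper: map a header name to its usual $_SERVER key.
function header_to_server_key(string $headerName): string
{
    return 'HTTP_' . strtoupper(str_replace('-', '_', $headerName));
}

// Hypothetical helper: turn a raw request into a $_SERVER-like array.
function parse_request_headers(string $rawRequest): array
{
    $server = [];
    $lines = explode("\r\n", trim($rawRequest));
    array_shift($lines); // drop the request line ("GET ... HTTP/1.1")

    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // not a "Name: value" header line
        }
        [$name, $value] = explode(':', $line, 2);
        $server[header_to_server_key(trim($name))] = trim($value);
    }

    return $server;
}

// Illustrative request; the UA string is invented for this example.
$rawRequest = "GET /user-agent.php HTTP/1.1\r\n"
    . "Host: localhost\r\n"
    . "User-Agent: Mozilla/5.0 (example)\r\n"
    . "Accept: text/html\r\n\r\n";

print_r(parse_request_headers($rawRequest));
```

A request that carries no User-Agent line simply produces no HTTP_USER_AGENT key, which is the situation examined in the rest of this article.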

Today, when I tried to grab the <meta> tags of a web page, I always got an empty result, even though the tags were plainly there when I viewed the page source directly. So I wondered whether the server decides what to output based on the request headers. To test this, I pointed get_meta_tags() at a local script that writes the request information it receives to a file. The result is as follows (backslashes replaced with / for easier viewing):

array (
'HTTP_HOST' => '192.168.30.205',
'PATH' => 'C:/Program Files/Common Files/NetSarang;C:/Program Files/NVIDIA Corporation/PhysX/Common;C:/Program Files/Common Files/Microsoft Shared/Windows Live;C:/Program Files/Intel/iCLS Client/;C:/Windows/system32;C:/Windows;C:/Windows/System32/Wbem;C:/Windows/System32/WindowsPowerShell/v1.0/;C:/Program Files/Intel/Intel(R) Management Engine Components/DAL;C:/Program Files/Intel/Intel(R) Management Engine Components/IPT;C:/Program Files/Intel/OpenCL SDK/2.0/bin/x86;C:/Program Files/Common Files/Thunder Network/KanKan/Codecs;C:/Program Files/QuickTime Alternative/QTSystem;C:/Program Files/Windows Live/Shared;C:/Program Files/QuickTime Alternative/QTSystem/;%JAVA_HOME%/bin;%JAVA_HOME%/jre/bin;',
'SystemRoot' => 'C:/Windows',
'COMSPEC' => 'C:/Windows/system32/cmd.exe',
'PATHEXT' => '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH;.MSC',
'WINDIR' => 'C:/Windows',
'SERVER_SIGNATURE' => '',
'SERVER_SOFTWARE' => 'Apache/2.2.11 (Win32) PHP/5.2.8',
'SERVER_NAME' => '192.168.30.205',
'SERVER_ADDR' => '192.168.30.205',
'SERVER_PORT' => '80',
'REMOTE_ADDR' => '192.168.30.205',
'DOCUMENT_ROOT' => 'E:/wamp/www',
'SERVER_ADMIN' => 'admin@admin.com',
'SCRIPT_FILENAME' => 'E:/wamp/www/user-agent.php',
'REMOTE_PORT' => '59479',
'GATEWAY_INTERFACE' => 'CGI/1.1',
'SERVER_PROTOCOL' => 'HTTP/1.0',
'REQUEST_METHOD' => 'GET',
'QUERY_STRING' => '',
'REQUEST_URI' => '/user-agent.php',
'SCRIPT_NAME' => '/user-agent.php',
'PHP_SELF' => '/user-agent.php',
'REQUEST_TIME' => 1400747529,
)
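The local script that produced this dump is not shown in the original. A minimal sketch of what such a user-agent.php might look like is given below; the function name log_request_info() and the log file name are assumptions, not the author's code.

```php
<?php
// Hypothetical logger: dumps the request information to a file and
// reports whether the client sent a User-Agent header.
function log_request_info(array $server, string $logFile): bool
{
    // Replace backslashes in string values so Windows paths read more easily.
    $clean = array_map(function ($value) {
        return is_string($value) ? str_replace('\\', '/', $value) : $value;
    }, $server);

    // var_export() produces the array ( ... ) notation seen in the dump.
    file_put_contents($logFile, var_export($clean, true) . PHP_EOL);

    return isset($server['HTTP_USER_AGENT']);
}

// Inside the real script this would be called roughly like:
// log_request_info($_SERVER, __DIR__ . '/request.log');
```

Requesting the script once from a browser and once through get_meta_tags('http://localhost/user-agent.php') then lets the two log files be compared.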

Sure enough, there is no HTTP_USER_AGENT element in the array: by default, PHP sends no User-Agent header when it requests a page from another server. After checking the documentation, I found that get_meta_tags() has no parameter for supplying a UA, so I had to solve the problem another way.

Later I fetched the page with cURL instead. That worked, but it was a little more cumbersome: I first had to forge the UA and then parse the <meta> tags with regular expressions.

The code for forging the UA is as follows:

// Initialize a cURL handle
$curl = curl_init();

// Set the URL to fetch
curl_setopt($curl, CURLOPT_URL, 'http://localhost/user-agent.php');

// Whether to include the response headers in the output; 0 leaves them out
curl_setopt($curl, CURLOPT_HEADER, 0);

// Set the UA. Here the visiting browser's UA is forwarded to the server; you can also specify a value manually.
// HTTP_USER_AGENT is not set when the client sent no UA (or under the CLI), so fall back to an example value.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : 'Mozilla/5.0 (compatible; ExampleClient/1.0)';
curl_setopt($curl, CURLOPT_USERAGENT, $ua);

// 1 returns the response as a string; 0 prints it and returns a bool for the result of the operation
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

// Run cURL to request the web page; curl_exec() returns false on failure
$data = curl_exec($curl);
if ($data === false) {
    echo 'cURL error: ' . curl_error($curl);
}

// Close the cURL handle
curl_close($curl);

// Process the obtained data
var_dump($data);
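The regular-expression step mentioned earlier is not shown in the original. One possible sketch follows; extract_meta_tags() is a hypothetical helper, and it mirrors get_meta_tags() only loosely (lowercase keys, nothing read past </head>). Regex-based HTML parsing is fragile, for instance when an attribute value contains a > character, so treat this as an illustration rather than a robust parser.

```php
<?php
// Hypothetical helper: pull name/content pairs out of an HTML string.
function extract_meta_tags(string $html): array
{
    $tags = [];

    // Only look at the part before </head>, as get_meta_tags() does.
    $end = stripos($html, '</head>');
    if ($end !== false) {
        $html = substr($html, 0, $end);
    }

    // Collect every <meta ...> tag, then read its attributes one by one
    // so that the order of name and content does not matter.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $matches)) {
        return $tags;
    }

    foreach ($matches[0] as $tag) {
        $hasName = preg_match('/(?<![\w-])name\s*=\s*(["\'])(.*?)\1/is', $tag, $name);
        $hasContent = preg_match('/(?<![\w-])content\s*=\s*(["\'])(.*?)\1/is', $tag, $content);

        if ($hasName && $hasContent) {
            $tags[strtolower($name[2])] = $content[2];
        }
    }

    return $tags;
}

// Sample input standing in for the $data string returned by curl_exec().
$sample = '<html><head>'
    . '<meta name="Description" content="A test page">'
    . "<meta content='x, y' name='keywords'>"
    . '<meta http-equiv="refresh" content="5">'
    . '</head><body><meta name="late" content="no"></body></html>';

print_r(extract_meta_tags($sample));
```

In the script above, the call would be extract_meta_tags($data) once curl_exec() has returned a string.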

I hope this article is helpful for your PHP programming.
