How to get web page content from a URL in PHP

小云云
2018-03-27 09:12:58

Fetching the content of a web page from its URL is very convenient in PHP. Pass the URL to the built-in function file_get_contents() and it returns the content of the page as a string. For example, the code to fetch the Baidu homepage is:

<?php
$html = file_get_contents('http://www.baidu.com/');

echo $html;

This displays the content of the Baidu homepage, but the function does not always work: some PHP hosts disable URL access for it (the allow_url_fopen setting), and some remote servers reject the request because it does not send the parameters they expect. For example:

<?php
$html = file_get_contents('http://www.163.com/');

echo $html;

This code does not retrieve the complete HTML of NetEase's homepage; the server returns a different page instead. In cases like this, we need another approach.
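One option that stays with file_get_contents() is to pass it a stream context carrying the headers the server may be looking for. The following is only a minimal sketch: build_browser_context() and fetch_with_context() are hypothetical helper names, the 10-second timeout is an assumption, and there is no guarantee that a User-Agent header alone is enough for any particular site.

```php
<?php
// Hypothetical helper: builds an HTTP stream context that sends the given
// User-Agent string and applies a read timeout (in seconds).
function build_browser_context(string $userAgent, int $timeoutSeconds = 10)
{
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => $timeoutSeconds,
        ],
    ]);
}

// Hypothetical helper: fetches $url through the context above.
// file_get_contents() returns false on failure, so the caller must check
// the result with === false before using it.
function fetch_with_context(string $url, string $userAgent)
{
    $context = build_browser_context($userAgent);
    // The @ suppresses the warning PHP emits when the request fails.
    return @file_get_contents($url, false, $context);
}
```

Calling fetch_with_context('http://www.163.com/', ...) with a browser User-Agent string would be the equivalent of the example above, with the extra header attached. This still requires allow_url_fopen to be enabled.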

Here we introduce PHP's cURL library, which can fetch web pages easily and effectively. You only need to run a script and parse the pages it fetched to get the data you want programmatically. Whether you want to retrieve part of the data from a link, take an XML file and import it into a database, or simply retrieve the content of a web page, cURL is a powerful PHP library. To use it, you must first enable it in the PHP configuration file (php.ini). On Windows, enabling it may require some additional DLLs; I won't go into the details here. To check whether cURL is enabled, call phpinfo(); if it is, it will appear among the loaded extensions.
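Besides reading the phpinfo() output by eye, the check can also be done in code. A small sketch, where curl_is_available() is a hypothetical helper name and the messages are only illustrative:

```php
<?php
// Hypothetical helper: reports whether the cURL extension is loaded
// and its entry-point function is callable.
function curl_is_available(): bool
{
    return extension_loaded('curl') && function_exists('curl_init');
}

if (curl_is_available()) {
    // curl_version() returns an array describing the linked libcurl.
    echo 'cURL is enabled, libcurl version ' . curl_version()['version'] . PHP_EOL;
} else {
    echo 'cURL is not enabled; enable the curl extension in php.ini' . PHP_EOL;
}
```

Running this from the command line and from the web server can give different answers, since the two may load different configuration files.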

The following is a simple example of using curl to obtain web page code:

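<?php
$ch = curl_init();
$timeout = 10; // connection timeout in seconds
curl_setopt($ch, CURLOPT_URL, 'http://www.163.com/');
// Return the response as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Send a browser User-Agent so the server treats the request as a browser visit
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36');
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$html = curl_exec($ch);
if ($html === false) {
    // curl_exec() returns false on failure
    echo 'cURL error: ' . curl_error($ch);
} else {
    echo $html;
}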

This code outputs the content of NetEase's homepage. The CURLOPT_USERAGENT line is the key: it sends a browser User-Agent string, so the server treats the request as coming from a browser and returns the correct HTML.
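The cURL steps above could be wrapped in a reusable function. This is only a sketch under some assumptions that are not in the original example: fetch_html() is a hypothetical name, and the total-time limit (CURLOPT_TIMEOUT) and redirect following (CURLOPT_FOLLOWLOCATION) are added choices.

```php
<?php
// Hypothetical helper: fetches $url with cURL and returns the response
// body, or null if the request fails.
function fetch_html(string $url, string $userAgent, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,            // return the body as a string
        CURLOPT_USERAGENT      => $userAgent,      // browser-like User-Agent
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // limit on connecting
        CURLOPT_TIMEOUT        => $timeoutSeconds, // assumption: limit total time too
        CURLOPT_FOLLOWLOCATION => true,            // assumption: follow redirects
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // Record the reason and signal failure to the caller.
        error_log('cURL error: ' . curl_error($ch));
        return null;
    }

    return $body;
}
```

A call such as fetch_html('http://www.163.com/', $userAgent), with $userAgent set to the browser string from the example, would then return the page HTML, or null on failure.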
