How to use phpQuery to collect web pages

WBOY
2016-07-13 10:25:11

phpQuery is an open source, server-side PHP library that lets developers work with the content of DOM documents, for example to extract the headline from a news website. It borrows its design from jQuery, so you can select and process page content much as you would with jQuery.
Collecting headlines
Let's start with an example: collecting the headline of the domestic news section from Sina. The code is as follows:

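include 'phpQuery/phpQuery.php'; // load the phpQuery core file
phpQuery::newDocumentFile('http://www.jb51.net'); // fetch and parse the target page
echo pq(".blkTop h1:eq(0)")->html(); // inner HTML of the first h1 inside .blkTop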

These three lines are enough to get the headline. The first includes the phpQuery.php core file, the second loads the target web page, and the third outputs the content of the selected tag.
pq() is the counterpart of jQuery's $(). Most jQuery selectors work in phpQuery; in method chains, simply replace jQuery's "." with PHP's "->". In the example above, pq(".blkTop h1:eq(0)") selects the DIV element whose class attribute is blkTop and finds the first h1 tag inside it. The html() method then returns the content of that h1 tag, including any HTML tags, which is the headline we want. The text() method would return only the text of the headline. The key to using phpQuery well is locating the node that holds the content you need.
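To make the difference between html() and text() concrete, here is a minimal sketch. It assumes phpQuery is installed at the same path as above and loads a small inline string with phpQuery::newDocumentHTML() instead of a live page; the markup, link, and headline text are hypothetical stand-ins for a real news page.

```php
<?php
require 'phpQuery/phpQuery.php'; // library path assumed, as in the article

// Hypothetical markup standing in for a news page
$markup = '<div class="blkTop">'
        . '<h1><a href="/story">Top story</a></h1>'
        . '<h1>Second story</h1>'
        . '</div>';
phpQuery::newDocumentHTML($markup);

$headline = pq('.blkTop h1:eq(0)'); // first h1 inside .blkTop

echo $headline->html(), "\n"; // inner markup, still containing the <a> tag
echo $headline->text(), "\n"; // text content only
```

Working on an inline string like this is also a convenient way to try out a selector before pointing the script at a real site.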
Collecting an article list
Let's look at another example: getting the list of blog articles from the helloweba.com website. The code is as follows:

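include 'phpQuery/phpQuery.php';
phpQuery::newDocumentFile('http://www.jb51.net');
$artlist = pq(".blog_li"); // every element with class blog_li
foreach ($artlist as $li) {
    // $li is a plain DOM node, so wrap it in pq() before chaining
    echo pq($li)->find('h2')->html() . "<br/>";
}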

Loop over the DIVs in the list, find each article title, and output it. It is that simple.
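The same loop can collect more than the title. The sketch below runs against hypothetical inline markup rather than a live site and gathers each article's title and link into an array; the h2 > a structure, the titles, and the URLs are assumptions made for illustration.

```php
<?php
require 'phpQuery/phpQuery.php'; // library path assumed, as in the article

// Hypothetical markup standing in for a blog list page
$markup = '<div class="blog_li"><h2><a href="/a1.html">First post</a></h2></div>'
        . '<div class="blog_li"><h2><a href="/a2.html">Second post</a></h2></div>';
phpQuery::newDocumentHTML($markup);

$articles = [];
foreach (pq('.blog_li') as $li) {
    // Each $li is a DOM node; pq() wraps it so find() and attr() can be used
    $link = pq($li)->find('h2 a');
    $articles[] = [
        'title' => trim($link->text()),
        'url'   => $link->attr('href'),
    ];
}

print_r($articles);
```

Collecting the results into an array first, instead of echoing inside the loop, keeps the scraping step separate from whatever you do with the data afterwards.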
Parsing an XML document
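Suppose there is a test.xml document like this (the contact and age element names follow the selector used below; the XML declaration and the root and name element names are assumed):

<?xml version="1.0" encoding="utf-8"?>
<root>
<contact>
<name>Zhang San</name>
<age>22</age>
</contact>
<contact>
<name>Wang Wu</name>
<age>18</age>
</contact>
</root>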

Now I want to get the age of the contact named Zhang San. Zhang San is the first contact in the document, so selecting the first age element is enough. The code is as follows:

include 'phpQuery/phpQuery.php';
phpQuery::newDocumentFile('test.xml');
echo pq('contact > age:eq(0)')->text(); // text of the first age element
Result output: 22
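
The selector above relies on Zhang San being the first contact. If the order is not known, one option is to loop over the contacts and compare names. This is a minimal sketch, not the article's own method: it assumes a name element inside each contact and loads an inline copy of the document with phpQuery::newDocumentXML(); the root and name element names are hypothetical.

```php
<?php
require 'phpQuery/phpQuery.php'; // library path assumed, as in the article

// Inline stand-in for test.xml; the root and name element names are assumptions
$xml = '<?xml version="1.0" encoding="utf-8"?>'
     . '<root>'
     . '<contact><name>Zhang San</name><age>22</age></contact>'
     . '<contact><name>Wang Wu</name><age>18</age></contact>'
     . '</root>';
phpQuery::newDocumentXML($xml);

$age = null;
foreach (pq('contact') as $contact) {
    $entry = pq($contact); // wrap the DOM node so phpQuery methods are available
    if (trim($entry->find('name')->text()) === 'Zhang San') {
        $age = trim($entry->find('age')->text());
        break; // stop at the first matching contact
    }
}

echo $age, "\n";
```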

As with jQuery, you locate the exact node in the document and output its content; parsing an XML document is that simple. You no longer need cumbersome regular expressions and string replacement to collect website content. With phpQuery, everything becomes much easier.
phpQuery project website: http://code.google.com/p/phpquery/
