How to get and parse XML data using PHP crawler
In web development, obtaining and parsing XML data is a very common operation. This article will focus on how to use a PHP crawler to obtain and parse XML data.
1. Obtain XML data
cURL is one of the most commonly used PHP extensions for fetching data over the network. You can use the following code to get XML data from a website:
$url = 'http://example.com/example.xml';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Return the response as a string instead of printing it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$xml = curl_exec($ch); // false if the request fails
curl_close($ch);
Here we use curl_init() to initialize a cURL handle and set the CURLOPT_URL option to the target URL. Setting the CURLOPT_RETURNTRANSFER option to 1 makes curl_exec() return the response body as a string instead of printing it directly. If the request fails, curl_exec() returns false.
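The snippet above does not check whether the request succeeded. The following is a minimal sketch of one way to add that check. The function name fetchXml(), the timeout default, and the choice to treat HTTP status codes of 400 and above as errors are assumptions made for illustration, not part of the original example.

```php
<?php
// Hypothetical helper (not from the original article): fetch a URL with
// cURL and throw an exception instead of silently returning false.
function fetchXml(string $url, int $timeoutSeconds = 10): string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Give up instead of hanging forever on a slow or dead server.
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);

    $body = curl_exec($ch);
    $error = curl_error($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.

    if ($body === false) {
        throw new RuntimeException("cURL request failed: $error");
    }
    if ($status >= 400) {
        // Assumption: an error page is not useful XML, so reject it.
        throw new RuntimeException("Unexpected HTTP status: $status");
    }
    return $body;
}

// Usage (requires network access):
// $xml = fetchXml('http://example.com/example.xml');
```

Throwing an exception keeps the failure handling in one place, so the parsing code that follows only ever receives a string.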
Besides cURL, the file_get_contents() function can also fetch XML data from a URL, provided the allow_url_fopen setting is enabled. For example:
$url = 'http://example.com/example.xml';
$xml = file_get_contents($url);
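file_get_contents() returns false when the read fails, and by default it has no request-specific timeout. As a rough sketch, a stream context can set a timeout and the return value can be checked explicitly. The function name loadXmlSource() and the 10-second default are assumptions for illustration.

```php
<?php
// Hypothetical helper (not from the original article): read XML from a
// URL or a local path and throw if nothing could be read.
function loadXmlSource(string $source, float $timeoutSeconds = 10.0): string
{
    // The 'http' options only apply when $source is an HTTP(S) URL.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // @ suppresses the PHP warning; the failure is reported below instead.
    $data = @file_get_contents($source, false, $context);
    if ($data === false) {
        throw new RuntimeException("Could not read XML from $source");
    }
    return $data;
}

// Usage (requires network access and allow_url_fopen):
// $xml = loadXmlSource('http://example.com/example.xml');
```

Because file_get_contents() also accepts local paths, the same helper can be pointed at a saved copy of the XML file while testing.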
2. Parse XML data
PHP provides a variety of methods to parse XML data.
SimpleXML is a very easy-to-use XML parser in PHP. We can use SimpleXML as follows:
$xml = simplexml_load_string($xml);
Here we use the simplexml_load_string() function to parse the XML string and convert it into a SimpleXMLElement object. The function returns false if the string is not well-formed XML.
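Since data fetched by a crawler is not guaranteed to be well-formed, it can help to collect the parser's error messages rather than let PHP emit warnings. The sketch below shows one possible approach using the libxml error functions. The function name parseXmlString() is an assumption for illustration.

```php
<?php
// Hypothetical helper (not from the original article): parse an XML
// string with SimpleXML and throw a descriptive exception on failure.
function parseXmlString(string $xml): SimpleXMLElement
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $root = simplexml_load_string($xml);
    $errors = libxml_get_errors();
    libxml_clear_errors();
    // Restore whatever error mode was active before this call.
    libxml_use_internal_errors($previous);

    if ($root === false) {
        $messages = array_map(fn($e) => trim($e->message), $errors);
        throw new RuntimeException('Invalid XML: ' . implode('; ', $messages));
    }
    return $root;
}

// Usage:
$root = parseXmlString('<bookstore><book><title>Mastering PHP 7</title></book></bookstore>');
echo $root->book->title . PHP_EOL;
```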
For example, suppose we have the following XML document:
<?xml version="1.0" encoding="UTF-8" ?>
<bookstore>
  <book>
    <title>PHP 7 Programming Blueprints</title>
    <author>Vikram Vaswani</author>
    <price>28.99</price>
  </book>
  <book>
    <title>Mastering PHP 7</title>
    <author>Chad Russell</author>
    <price>39.99</price>
  </book>
</bookstore>
We can use the following code to access and output this XML data:
foreach ($xml->book as $book) {
    echo "Title: " . $book->title . "<br>";
    echo "Author: " . $book->author . "<br>";
    echo "Price: " . $book->price . "<br>";
}
The output is as follows:
Title: PHP 7 Programming Blueprints
Author: Vikram Vaswani
Price: 28.99
Title: Mastering PHP 7
Author: Chad Russell
Price: 39.99
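Each value read this way is still a SimpleXMLElement object, which echo converts to text automatically. To store or calculate with the values, they need to be cast explicitly. The following sketch, which reuses the bookstore document above, shows one way to turn the books into plain PHP arrays. The variable names and the total-price calculation are assumptions added for illustration.

```php
<?php
$xmlString = <<<XML
<?xml version="1.0" encoding="UTF-8" ?>
<bookstore>
  <book>
    <title>PHP 7 Programming Blueprints</title>
    <author>Vikram Vaswani</author>
    <price>28.99</price>
  </book>
  <book>
    <title>Mastering PHP 7</title>
    <author>Chad Russell</author>
    <price>39.99</price>
  </book>
</bookstore>
XML;

$bookstore = simplexml_load_string($xmlString);

$books = [];
foreach ($bookstore->book as $book) {
    // Cast each element to a scalar so the array holds plain values.
    $books[] = [
        'title'  => (string) $book->title,
        'author' => (string) $book->author,
        'price'  => (float) $book->price,
    ];
}

$total = array_sum(array_column($books, 'price'));
// prints "2 books, total 68.98"
echo count($books) . ' books, total ' . number_format($total, 2) . PHP_EOL;
```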
DOMDocument is another commonly used XML parser in PHP. We can use DOMDocument as follows:
// $xml must be the raw XML string here, not the SimpleXMLElement
// object created in the previous example
$doc = new DOMDocument();
$doc->loadXML($xml);
$books = $doc->getElementsByTagName("book");
foreach ($books as $book) {
    $titles = $book->getElementsByTagName("title");
    $title = $titles->item(0)->nodeValue;
    $authors = $book->getElementsByTagName("author");
    $author = $authors->item(0)->nodeValue;
    $prices = $book->getElementsByTagName("price");
    $price = $prices->item(0)->nodeValue;
    echo "Title: " . $title . "<br>";
    echo "Author: " . $author . "<br>";
    echo "Price: " . $price . "<br>";
}
Here we use the DOMDocument class to parse the XML string, and then use the getElementsByTagName() method to obtain specific elements. The final output is the same as in the SimpleXML example.
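The DOMDocument example repeats the same lookup for every field, and item(0) returns null when a book has no such child element, which would make the nodeValue access fail. As a sketch of one way to tidy this up, the lookup can be moved into a small helper. The function name childText() and the 'n/a' fallback are assumptions for illustration.

```php
<?php
// Hypothetical helper (not from the original article): return the text
// of the first descendant element with the given tag name, or null.
function childText(DOMElement $parent, string $tagName): ?string
{
    $node = $parent->getElementsByTagName($tagName)->item(0);
    return $node === null ? null : trim($node->textContent);
}

$doc = new DOMDocument();
// loadXML() returns false when the string cannot be parsed.
if (!$doc->loadXML('<bookstore><book><title>Mastering PHP 7</title></book></bookstore>')) {
    throw new RuntimeException('Could not parse XML');
}

foreach ($doc->getElementsByTagName('book') as $book) {
    $title = childText($book, 'title');
    // This sample book has no <price>, so the fallback is used.
    $price = childText($book, 'price') ?? 'n/a';
    // prints "Title: Mastering PHP 7, Price: n/a"
    echo "Title: $title, Price: $price" . PHP_EOL;
}
```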
3. Summary
In this article, we learned how to use a PHP crawler to obtain and parse XML data: the cURL extension and the file_get_contents() function fetch the XML, and SimpleXML and DOMDocument parse it. We hope this article is helpful to you.