Parsing XML with PHP and the expat toolkit
Everyone these days touts XML as a Web developer's best friend because it makes it easy to format and display data from almost any data source. However, well-formatted data is still a long way from dynamic content, and most web developers will ask how today's web could work without dynamic content. So the question is: "How do you use XML to create dynamic content?"
The answer is to parse the XML with a language built for processing dynamic content, such as PHP or Perl. In theory, such a language can put XML to all kinds of uses; all it needs is a toolkit that can parse XML. James Clark provides one such toolkit, called expat. expat is an XML parser written in C, and it makes it easy for PHP and XML to dance together.
PHP is a great scripting language designed specifically for the web. XML is a standard for representing web content. How beautiful it would be if the two joined forces!
Below I will walk through a simple example that uses PHP to turn an XML document into HTML. Then I'll introduce some other XML concepts in PHP. Parsing XML with PHP is simple and intuitive, but some of the details need explaining. Once you really get the hang of it, you'll wonder why you didn't think of combining the two earlier.
Overview
PHP parses XML through expat, an XML toolkit written in C. This is the same toolkit that Perl's XML parsing uses. It is also an event-driven parser: expat treats each XML tag or piece of character data as an event, and each event triggers a function that you register. Installing expat is very simple, and if you are using the Apache web server, you can find download and installation instructions on the PHP XML reference page.
The basic procedure for parsing XML with PHP is this: first, create an instance of the XML parser. Next, register the functions that handle events such as start and end tags. Then register the function that processes the actual data. Finally, open the XML file, read its contents, and pass them to the parser. Afterwards, close the file and free the XML parser.
Look, like I said, there’s nothing special about this process. However, before we discuss specific examples, here are some caveats:
Expat does not perform XML validation. This means that as long as the XML file is well-formed - all elements are nested properly, opening and closing tags have no errors - it will be parsed. Expat does not care whether the XML conforms to the standards or definitions referenced in the XML file header.
By default, the parser converts all XML tag names to uppercase letters (PHP calls this case folding, and it can be switched off with xml_parser_set_option). Be careful if your script mixes uppercase and lowercase letters in tag names and other content.
If PHP is configured with magic quotes enabled, complex XML files will not be parsed correctly. If magic quotes is not enabled in your setup (the setting was removed entirely in PHP 5.4), just pretend I didn't say it.
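The first two caveats are easy to see in a few lines. What follows is only a minimal sketch: the sample documents and the helper names collectNames and isWellFormed are my own assumptions for illustration, not part of the article's script. It uses xml_parser_set_option with XML_OPTION_CASE_FOLDING to switch the uppercase conversion on and off, and it checks the return value of xml_parse to tell a well-formed document from a broken one.

```php
<?php
// Hypothetical sample documents; the tag names are assumptions for illustration.
$mixedCase   = '<links><link><Title>PHP manual</Title></link></links>';
$breaksDtd   = '<?xml version="1.0"?><!DOCTYPE note [<!ELEMENT note (to)>'
             . '<!ELEMENT to (#PCDATA)>]><note><from>me</from></note>';
$misNested   = '<a><b></a></b>';

// Return the element names exactly as the start-tag handler receives them.
function collectNames(string $xml, bool $caseFolding): array
{
    $names  = [];
    $parser = xml_parser_create();
    // Case folding is on by default; 0 keeps names as written in the document.
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$names) { $names[] = $name; },
        function ($p, $name) { /* end tags are ignored in this sketch */ }
    );
    xml_parse($parser, $xml, true);
    xml_parser_free($parser);
    return $names;
}

// xml_parse returns 1 when the chunk parsed cleanly and 0 on an error.
function isWellFormed(string $xml): bool
{
    $parser = xml_parser_create();
    $ok     = xml_parse($parser, $xml, true) === 1;
    xml_parser_free($parser);
    return $ok;
}

print_r(collectNames($mixedCase, true));   // names with case folding on
print_r(collectNames($mixedCase, false));  // names with case folding off
var_dump(isWellFormed($breaksDtd));        // violates its DTD, but tags nest properly
var_dump(isWellFormed($misNested));        // tags overlap instead of nesting
```

If the parser behaves as described above, the first call reports uppercase names, the second keeps them as written, and only the mis-nested document is rejected, even though the other document does not follow its own DTD.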
Okay, let’s take a look at the relevant examples now!
Basic Example
To keep things simple, I have omitted error checking and other non-essentials from the example. Of course, you can add whatever you want in your own code. I assume that you are already familiar with PHP and its syntax, so I will only explain the XML functions. First I'll explain what the script does, and then I'll cover the user-defined functions, which in the actual script come before the code that references them. Program Listing A shows the complete code of the script, and Program Listing B is the XML document the script parses. The output after processing is shown in Table A.
Table A: Output of the PHP script after parsing the XML
First, I create an instance of the XML parser:
$parser = xml_parser_create();
Then I define what the parser does when it encounters opening and closing tags. Note that "startElement" and "endElement" are user-defined functions. Of course, you can give them other names according to your own preferences, but the names I use are the customary ones.
xml_set_element_handler($parser, "startElement", "endElement");
Next I define the handler for character data. "characterData" is also a user-defined function, and its name is also customary.
xml_set_character_data_handler($parser, "characterData");
Now open the file to read the data. This is a good place to start writing error handling code; I've omitted it in the example. Don't forget to define $xml_file at the beginning of the script.
$filehandler = fopen($xml_file, "r");
I read the file content 4K bytes at a time into the variable "$data" until the end of the file, and use xml_parse to parse each chunk as it is read. The third argument tells the parser whether this chunk is the last one.
while ($data = fread($filehandler, 4096)) {
    xml_parse($parser, $data, feof($filehandler));
}
Finally, clean up: close the file and free the parser.
fclose($filehandler);
xml_parser_free($parser);
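For readers who want the error handling I left out, here is one way the same steps could be wrapped up. Treat it as a sketch under my own assumptions: the function name parseXmlFile and its callback parameters are hypothetical, and it brings in three extra functions from the PHP XML extension (xml_get_error_code, xml_error_string and xml_get_current_line_number) that the script above does not use. The loop is also driven by feof rather than by the return value of fread, so that the parser is always told when the final chunk has arrived.

```php
<?php
// Parse a file in 4 KB chunks and throw on the first problem.
// $onStart, $onEnd and $onData stand in for the article's startElement,
// endElement and characterData handlers (hypothetical parameter names).
function parseXmlFile(string $path, callable $onStart, callable $onEnd, callable $onData): void
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $parser = xml_parser_create();
    xml_set_element_handler($parser, $onStart, $onEnd);
    xml_set_character_data_handler($parser, $onData);

    try {
        while (!feof($handle)) {
            $data = fread($handle, 4096);
            if ($data === false) {
                throw new RuntimeException("Read error on $path");
            }
            // The third argument is true once the end of the file is reached.
            if (xml_parse($parser, $data, feof($handle)) === 0) {
                throw new RuntimeException(sprintf(
                    'XML error: %s at line %d',
                    xml_error_string(xml_get_error_code($parser)),
                    xml_get_current_line_number($parser)
                ));
            }
        }
    } finally {
        // Clean up whether or not parsing succeeded.
        fclose($handle);
        xml_parser_free($parser);
    }
}

// Usage with a throwaway file; the XML content is made up for the demo.
$xml_file = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($xml_file, '<links><url>https://www.php.net/</url></links>');

parseXmlFile(
    $xml_file,
    function ($parser, $name, $attrs) { echo "start $name\n"; },
    function ($parser, $name) { echo "end $name\n"; },
    function ($parser, $data) { echo "data $data\n"; }
);
unlink($xml_file);
```

Throwing an exception is just one choice; a script of this kind could equally print the message and exit.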
The above are all the XML functions used in the script. Let me explain in detail the 3 user-defined functions used in it. They are "startElement", "endElement" and "characterData".
Whenever xml_parse encounters a start tag like
For this example, I decided to display my XML data in an HTML table. As before, I did not write error handling code, for the sake of simplicity. I'm also taking a shortcut here, because I know the order in which the tags appear in the XML file. Otherwise I could have the "startElement", "characterData" and "endElement" functions fill an array and then display the results with a separate function.
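To make that alternative concrete, here is a rough sketch of the array approach. Everything specific in it is an assumption: the sample XML, the LINK and TITLE tag names, and the function names collectLinks and renderTable are invented for illustration (only the URL tag appears in my script). The handlers do nothing but collect text into an array, and a separate function turns the array into a table.

```php
<?php
// Hypothetical input; the tag names other than URL are assumptions.
$xml = '<links>'
     . '<link><title>PHP manual</title><url>https://www.php.net/manual/</url></link>'
     . '<link><title>Expat</title><url>https://libexpat.github.io/</url></link>'
     . '</links>';

// Build one associative array per <link>, independent of tag order.
function collectLinks(string $xml): array
{
    $rows  = [];
    $field = null; // name of the element whose text is currently being read

    $parser = xml_parser_create();
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$rows, &$field) {
            if ($name === 'LINK') {
                $rows[] = [];
                $field  = null;
            } elseif (($name === 'TITLE' || $name === 'URL') && $rows !== []) {
                $field = $name;
                $rows[count($rows) - 1][$name] = '';
            }
        },
        function ($p, $name) use (&$field) { $field = null; }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$rows, &$field) {
            // Text can arrive in several pieces, so append instead of assigning.
            if ($field !== null) {
                $rows[count($rows) - 1][$field] .= $data;
            }
        }
    );
    xml_parse($parser, $xml, true);
    xml_parser_free($parser);
    return $rows;
}

// Display step, kept separate from parsing.
function renderTable(array $rows): string
{
    $html = "<table>\n";
    foreach ($rows as $row) {
        $html .= sprintf(
            "<tr><td><a href=\"%s\">%s</a></td></tr>\n",
            htmlspecialchars($row['URL'] ?? ''),
            htmlspecialchars($row['TITLE'] ?? '')
        );
    }
    return $html . "</table>\n";
}

echo renderTable(collectLinks($xml));
```

This costs a little more code, but the output no longer depends on the order of the tags. My own script takes the simpler route and echoes the HTML straight from the handlers.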
function startElement($parser_instance, $element_name, $attrs) {
    switch($element_name) {
        case "URL" : echo "