
XML application in PHP

WBOY (original)
2016-07-13 17:39:56

Overview
XML stands for eXtensible Markup Language. XML is a set of rules for defining semantic markup that divides a document into parts and identifies those parts. It is also a meta-markup language: it defines a syntax for defining other domain-specific, semantic, structured markup languages. XML is currently a very popular technology, and PHP provides functions for parsing XML documents. Below we discuss the application of XML in PHP.

  Overview of XML
Before talking about XML (eXtensible Markup Language), let us first look at a piece of HTML code:
 <html>
 <title>XML</title>
 <body>
 <p><center><font color="red">TEXT</font></center></p>
 <a href="www.domain.com"><img src="logo.jpg"/></a>
 </body>
 </html>
  The code above is structurally consistent with the rules of XML. XML can be understood as a tree structure containing data, subject to the following rules:
1. When referring to the same element, use consistent case. An element that is opened in one case and closed in another does not comply with the rules.
2. Every attribute value (such as href="????") must be enclosed in quotation marks. An unquoted attribute value is incorrect.
3. Every element must be closed: an element consists either of a start tag with a matching end tag, or of an empty element ending in "/>". If the "/" in the closing "/>" is missing, the code is in error.
4. All elements must be properly nested within one another, just like loops in a program, and all elements must be nested in the root element. For example, all the contents of the code above are nested in the html element.
5. Element names (body, a, p, img above, etc.) should start with a letter (an underscore is also permitted).
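The rules above can be checked mechanically. The following is a minimal sketch that uses PHP's xml_parse() function (part of the Expat-based parser functions discussed later in this article) to test whether a string is well-formed. The helper name is_well_formed() and the sample strings are assumptions made for illustration.

```php
<?php
// Hypothetical helper (not part of PHP): true when $xml is well-formed.
function is_well_formed(string $xml): bool
{
    $parser = xml_parser_create();
    // xml_parse() returns 1 on success and 0 when the parser hits an error.
    return xml_parse($parser, $xml, true) === 1;
}

// Complies with the rules: quoted attribute values, closed empty element.
var_dump(is_well_formed('<a href="www.domain.com"><img src="logo.jpg"/></a>'));
// Rule 1: start and end tag differ in case.
var_dump(is_well_formed('<center>TEXT</CENTER>'));
// Rule 2: attribute value is not quoted.
var_dump(is_well_formed('<a href=www.domain.com>TEXT</a>'));
// Rule 3: the img element is never closed.
var_dump(is_well_formed('<p><img src="logo.jpg"></p>'));
```

Only the first call should report true; each of the other strings breaks one of the rules listed above.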

 How to use PHP’s XML parser Expat?
Expat is an XML parser (also called an XML processor) available to the PHP scripting language. It allows programs to access the structure and content of XML documents, and it is an event-based parser. There are two basic types of XML parsers:
Tree-based parsers: convert an XML document into a tree structure. This type of parser parses the entire document and provides an API to access each element of the resulting tree. The common standard for it is DOM (Document Object Model).
Event-based parsers: treat an XML document as a series of events. When a particular event occurs, the parser calls a function provided by the developer to handle it. An event-based parser has a data-centric view of the XML document: it focuses on the data portion of the document rather than on its structure. These parsers process the document from beginning to end and report events such as start of element, end of element, and start of character data to the application through callback functions.
The following is an example XML document for "Hello-World":

 <greeting>Hello World</greeting>

An event-based parser reports this document as three events:
Start element: greeting
Start of CDATA item with value: Hello World
End element: greeting
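As a rough illustration of these three events, the sketch below registers callbacks that simply record each event as a line of text. The $events array and the wording of the recorded lines are assumptions made for this example.

```php
<?php
$events = [];                      // illustrative log of reported events
$parser = xml_parser_create();
// Keep element names exactly as written in the document.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) {
        $events[] = "Start element: $name";
    },
    function ($parser, $name) use (&$events) {
        $events[] = "End element: $name";
    }
);
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) {
        $events[] = "Character data: $data";
    }
);

xml_parse($parser, '<greeting>Hello World</greeting>', true);
print_r($events);
```

For longer text the parser may deliver character data in several pieces, so a real handler should not assume one call per text node.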
An event-based parser does not generate a structure that describes the document. Of course, if you need one when using Expat, a complete native tree structure can be built in PHP. When it reports the CDATA item, the event-based parser does not tell you about the parent element greeting. However, it provides lower-level access, which allows better use of resources and faster access. There is no need to fit the entire document into memory; in fact, the document can even be larger than the available memory.

Although the Hello-World example above is a well-formed XML document, it is not valid, because it has neither an associated DTD (Document Type Definition) nor an embedded one. Expat, however, is a non-validating parser and therefore ignores any DTD associated with the document. Note that the document must still be well-formed; otherwise Expat (like other XML-compliant parsers) stops with an error message.

  Compiling Expat
Expat can be compiled into PHP version 3.0.6 or above. Starting from Apache 1.3.22, Expat has been included as part of Apache. On Unix systems, Expat can be compiled into PHP by configuring PHP with the --with-xml option.
If PHP is compiled as an Apache module, Expat is part of Apache by default. On Windows, the XML dynamic link library must be loaded.
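Whether the XML functions are actually available in a given PHP build can be checked at run time. This is a small sketch; the variable name $hasXml and the printed messages are illustrative only.

```php
<?php
// True when the xml extension is loaded and exposes the parser functions.
$hasXml = extension_loaded('xml') && function_exists('xml_parser_create');

echo $hasXml
    ? "XML parser functions are available\n"
    : "The xml extension is not loaded\n";
```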
XML Example: XMLstats
The example we are going to discuss uses Expat to collect statistics on XML documents.
For each element in the document, the following information is output:
* The number of times the element is used in the document
* The amount of character data in the element
* The parent element of the element
* The child elements of the element
Note: For demonstration purposes, we use PHP to build a structure that stores the parent and child elements of each element.
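The statistics listed above could be gathered roughly as follows. This is a sketch, not the article's XMLstats script: the layout of $stats, the helper array $stack, and the sample document are all assumptions.

```php
<?php
$stats = [];   // element name => count, chars, parents, children (assumed layout)
$stack = [];   // names of the elements that are currently open

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$stats, &$stack) {
        if (!isset($stats[$name])) {
            $stats[$name] = ['count' => 0, 'chars' => 0, 'parents' => [], 'children' => []];
        }
        $stats[$name]['count']++;
        if ($stack) {
            // The element on top of the stack is the parent of this one.
            $parent = end($stack);
            $stats[$name]['parents'][$parent] = true;
            $stats[$parent]['children'][$name] = true;
        }
        $stack[] = $name;
    },
    function ($parser, $name) use (&$stack) {
        array_pop($stack);
    }
);
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$stats, &$stack) {
        if ($stack) {
            // Character data is credited to the innermost open element.
            $stats[end($stack)]['chars'] += strlen($data);
        }
    }
);

xml_parse($parser, '<book><title>PHP</title><title>XML</title></book>', true);

foreach ($stats as $name => $s) {
    printf(
        "%s: used %d time(s), %d byte(s) of character data, parents [%s], children [%s]\n",
        $name,
        $s['count'],
        $s['chars'],
        implode(', ', array_keys($s['parents'])),
        implode(', ', array_keys($s['children']))
    );
}
```

Storing parents and children as array keys keeps each name unique without an extra search.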

What are the functions used to generate XML parser instances?
The function used to create an XML parser instance is xml_parser_create(). This instance is passed to all subsequent parser functions. The idea is very similar to the connection handle used by the MySQL functions in PHP. Before parsing a document, event-based parsers usually require you to register callback functions that are called when a specific event occurs. Expat is no exception. It defines the following seven possible events:

Object                     Function                                    Description
Element                    xml_set_element_handler()                   Start and end of an element
Character data             xml_set_character_data_handler()            Start of character data
External entity            xml_set_external_entity_ref_handler()       An external entity reference occurs
Unparsed external entity   xml_set_unparsed_entity_decl_handler()      An unparsed external entity occurs
Processing instruction     xml_set_processing_instruction_handler()    A processing instruction occurs
Notation declaration       xml_set_notation_decl_handler()             A notation declaration occurs
Default                    xml_set_default_handler()                   All other events for which no handler is specified
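Handlers for the less common events are registered in the same way as element handlers. The sketch below registers a processing-instruction handler next to a character data handler; the $log array and the sample document, including the made-up processing instruction "report", are assumptions for illustration.

```php
<?php
$log = [];   // illustrative record of the events seen
$parser = xml_parser_create();

// Called for every processing instruction: <?target data?>
xml_set_processing_instruction_handler(
    $parser,
    function ($parser, $target, $data) use (&$log) {
        $log[] = "PI $target: $data";
    }
);
// Called for character data between tags.
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$log) {
        $log[] = "Text: $data";
    }
);

xml_parse($parser, '<doc><?report mode=fast?>Hi</doc>', true);
print_r($log);
```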
Regarding the sample script at the end of this article, note that it uses both element handler functions and a character data handler function. The element callback handlers are registered through xml_set_element_handler().
This function requires three parameters:
An instance of the parser
The name of the callback function that handles start elements
The name of the callback function that handles end elements
The callback functions must exist when parsing of the XML document begins. They must be defined consistently with the prototypes described in the PHP manual.
For example, Expat passes three parameters to the handler function for start elements. In the script example, it is defined as follows:
function start_element($parser, $name, $attrs)
$parser is the parser handle, $name is the name of the start element, and $attrs is an array containing all attributes of the element and their values.
Once it starts parsing the XML document, Expat calls the start_element() function and passes these parameters whenever it encounters a start element.
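To make the three parameters concrete, here is a small sketch in which start_element() records the name and attributes it receives. The body of the handler, the empty end_element() function, and the storage in $GLOBALS['seen'] are assumptions; the article's own script is not reproduced here.

```php
<?php
// Collected (name, attributes) pairs; storage location is illustrative only.
$GLOBALS['seen'] = [];

function start_element($parser, $name, $attrs)
{
    // $attrs maps attribute names to their values.
    $GLOBALS['seen'][] = [$name, $attrs];
}

function end_element($parser, $name)
{
    // Nothing to do at the end of an element in this sketch.
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'start_element', 'end_element');

xml_parse($parser, '<a href="www.domain.com"><img src="logo.jpg"/></a>', true);
print_r($GLOBALS['seen']);
```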

 
XML Case Folding option

Use the xml_parser_set_option() function to turn off the case folding option. This option is on by default and causes element names passed to handler functions to be converted to uppercase automatically. XML, however, is case sensitive, so case matters when collecting statistics on XML documents. For our example, the case folding option must be turned off.
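The effect of the option can be seen by parsing the same document twice. The helper element_names() below is a hypothetical function written for this comparison, and the sample document is likewise an assumption.

```php
<?php
// Hypothetical helper: returns element names as the start handler receives them.
function element_names(string $xml, bool $caseFolding): array
{
    $names = [];
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding ? 1 : 0);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, $name) {
            // End elements are not needed for this comparison.
        }
    );
    xml_parse($parser, $xml, true);
    return $names;
}

print_r(element_names('<Book><title/></Book>', true));   // names folded to uppercase
print_r(element_names('<Book><title/></Book>', false));  // names as written
```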

 How to parse the document?
After completing all the preparation work, the script can finally parse the XML document:
xml_parse_from_file(), a custom function, opens the file specified in its parameter and parses it in 4 KB chunks.
xml_parse(), like xml_parse_from_file(), returns false when an error occurs, that is, when the XML document is not well-formed. We can use the xml_get_error_code() function to get the numeric code of the last error, and pass this code to the xml_error_string() function to get the error message text. Outputting the current line number of the XML document makes debugging easier.
When parsing a document with Expat, one question needs particular attention: how do we maintain a basic description of the document structure?
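The body of xml_parse_from_file() is not shown in this article, so the following is only a guess at how such a helper might look, together with the error reporting described above. The temporary file and its deliberately malformed contents are assumptions for the demonstration.

```php
<?php
// Hypothetical reconstruction of the custom helper; the original body is not shown.
function xml_parse_from_file($parser, string $file): bool
{
    $fp = @fopen($file, 'r');
    if ($fp === false) {
        return false;
    }
    while (!feof($fp)) {
        $chunk = fread($fp, 4096);   // read at most 4 KB at a time
        // The third argument tells the parser whether this is the final chunk.
        if ($chunk === false || !xml_parse($parser, $chunk, feof($fp))) {
            fclose($fp);
            return false;
        }
    }
    fclose($fp);
    return true;
}

// Demonstration with a malformed document: <item> is never closed.
$file = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($file, "<doc>\n<item></doc>\n");

$parser = xml_parser_create();
$ok = xml_parse_from_file($parser, $file);
if (!$ok) {
    printf(
        "XML error: %s at line %d\n",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)
    );
}
unlink($file);
```

Feeding the parser in chunks is what lets an event-based parser handle documents that do not fit in memory.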
As mentioned before, the event-based parser itself does not produce any structural information. However, the tag structure is an important feature of XML. For example, the element sequence <book><title> means something different than <figure><title>. The title of a book and the title of a picture are unrelated, although both use the term "title". Therefore, to process XML effectively with an event-based parser, you must use your own stacks or lists to maintain the structural information of the document.
In order to mirror the document structure, the script needs to know at least the parent element of the current element. This is not possible with Expat's API alone: it only reports events for the current element, without any contextual information. Therefore, you need to build your own stack structure.
The script example uses a first-in, last-out (FILO) stack. The stack is an array that holds all start elements. In the start element handler, the current element is pushed onto the top of the stack with the array_push() function. Correspondingly, the end element handler removes the top element with array_pop().
For the sequence <book><title>, the stack is filled as follows:
Start element book: assign "book" to the first element of the stack ($stack[0]).
Start element title: assign "title" to the top of the stack ($stack[1]).
End element title: remove the top element from the stack ($stack[1]).
End element book: remove the last element from the stack ($stack[0]).
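The stack handling described above might be sketched as follows. The $paths array, which records the full path of every start element, and the sample document with its root element library are assumptions added to show why the two title elements can be told apart.

```php
<?php
$stack = [];   // currently open elements, innermost last
$paths = [];   // illustrative: path of each start element, built from the stack

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$stack, &$paths) {
        array_push($stack, $name);        // push the start element
        $paths[] = implode('/', $stack);  // context = everything on the stack
    },
    function ($parser, $name) use (&$stack) {
        array_pop($stack);                // remove the element that just ended
    }
);

xml_parse(
    $parser,
    '<library><book><title>PHP</title></book><figure><title>Logo</title></figure></library>',
    true
);
print_r($paths);
```

With the stack in place, the parent of the current element is simply the entry below the top, so book/title and figure/title are no longer confused.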
