http://trash.chregu.tv/phpconf2003/examples/
New XML features of PHP5
Author: Christian Stocker. Translation: ice_berg16 (Scarecrow in search of dreams)
Intended audience
This article is intended for PHP developers of all levels who are interested in PHP5's new XML features. We assume that the reader has basic knowledge of XML. If you are already using XML in your PHP code, you will benefit from this article as well.
Introduction
In today's Internet world, XML is no longer a buzzword; it has been widely accepted and standardized. Accordingly, XML support received far more attention in PHP5 than in PHP4. In PHP4 you were almost always faced with non-standard behavior, API breaks, memory leaks and incomplete functionality. Although some deficiencies were addressed in PHP 4.3, the developers decided to abandon the original code and rewrite all of it for PHP5.
This article will introduce all the exciting new features of XML in PHP5 one by one.
XML in PHP4
Early PHP versions already supported XML, but only through a SAX-based interface that can easily parse any XML document. XML support improved with the addition of the DOMXML extension in PHP4, and XSLT was added later. Throughout the PHP 4 era, further features such as HTML parsing, XSLT and DTD validation were added to the DOMXML extension. Unfortunately, because the XSLT and DOMXML extensions never left the experimental stage and their APIs changed more than once, they were never installed by default. In addition, the DOMXML extension did not follow the DOM standard established by the W3C but used its own naming scheme. Although this was improved in PHP 4.3 and many memory leaks and other problems were fixed, the extension never reached a stable state, and some deep-rooted problems became almost impossible to fix. Only the SAX extension was installed by default; the other extensions were never widely deployed.
For all these reasons, PHP's XML developers decided to rewrite the entire code for PHP5 and to follow the established standards.
XML in PHP5
Almost all XML support has been completely rewritten in PHP5. All XML extensions are now based on the GNOME project's libxml2 library. This allows the different extensions to interoperate, and the core developers only need to work with one underlying library. For example, complex memory management only has to be implemented once to benefit all XML-related extensions.
In addition to the well-known SAX parser inherited from PHP4, PHP5 supports DOM according to the W3C standard and XSLT based on the libxslt engine. It also adds the PHP-specific SimpleXML extension and a standard-compliant SOAP extension. As XML becomes more and more important, the PHP developers decided to enable more XML support by default. This means you can now use SAX, DOM and SimpleXML out of the box, and these extensions will be available on more servers. Support for XSLT and SOAP, however, still needs to be explicitly configured when PHP is compiled.
Stream support
All XML extensions now support PHP streams, even when you do not access the stream directly from PHP. For example, in PHP5 you can access a PHP stream from an include directive inside an XML file, such as an XSLT or XInclude include. Basically, you can use a PHP stream anywhere you can use a regular file.
Streams were introduced in PHP 4.3 and have been further improved in PHP5. They unify file access, network access and similar operations behind one common set of functions. You can even implement your own streams in PHP code, which makes data access very simple. Please refer to the PHP documentation for more details.
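To make the idea of a self-written stream concrete, here is a minimal sketch. The VarStream class and the var:// scheme are hypothetical names invented for this illustration; only stream_wrapper_register() and the stream_*/url_stat method names come from PHP's stream wrapper interface. It assumes a PHP version with the DOM extension enabled.

```php
<?php
// Hypothetical wrapper: serves XML strings held in memory under "var://<name>".
class VarStream
{
    public static $documents = [];   // name => XML string
    public $context;                 // populated by PHP when a stream context is used
    private $data = '';
    private $pos = 0;

    public function stream_open($path, $mode, $options, &$opened_path)
    {
        $name = parse_url($path, PHP_URL_HOST);
        if (!isset(self::$documents[$name])) {
            return false;
        }
        $this->data = self::$documents[$name];
        $this->pos = 0;
        return true;
    }

    public function stream_read($count)
    {
        $chunk = (string) substr($this->data, $this->pos, $count);
        $this->pos += strlen($chunk);
        return $chunk;
    }

    public function stream_eof()
    {
        return $this->pos >= strlen($this->data);
    }

    public function stream_stat()
    {
        return ['size' => strlen($this->data)];
    }

    public function stream_close()
    {
        // Nothing to release for an in-memory string.
    }

    // The XML loader may stat the URL before opening it, so answer that too.
    public function url_stat($path, $flags)
    {
        $name = parse_url($path, PHP_URL_HOST);
        return isset(self::$documents[$name])
            ? ['size' => strlen(self::$documents[$name])]
            : false;
    }
}

stream_wrapper_register('var', 'VarStream');
VarStream::$documents['articles'] = '<articles><item><title>XML in PHP5</title></item></articles>';

$dom = new DOMDocument();
$dom->load('var://articles');    // the DOM extension reads through the custom stream
print $dom->documentElement->nodeName . "\n";
```

The same var:// URL could be handed to any other function that accepts a file name, which is the point of the stream abstraction.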
SAX
SAX stands for Simple API for XML. It is a callback-based interface for parsing XML documents. SAX has been supported since PHP3 and has not changed much since. The API is unchanged in PHP5, so your existing code will still run. The only difference is that it is no longer based on the expat library but on libxml2.
This change caused some problems with namespace support, which have been solved in libxml2 2.6 but not in earlier versions. So if you use xml_parser_create_ns(), it is strongly recommended to install libxml2 2.6 or later on your system.
DOM
DOM (Document Object Model) is a set of standards developed by the W3C for accessing XML document trees. In PHP4 you could use DOMXML for this. The main problems with DOMXML were that it did not follow the standard naming scheme and that it leaked memory for a long time (fixed in PHP 4.3).
The new DOM extension follows the W3C standard, including its method and property names. If you are familiar with the DOM in other languages, such as JavaScript, writing similar functionality in PHP will be very easy. You don't have to check the documentation every time because the methods and parameters are the same.
Because the new extension follows the W3C standard, DOMXML-based code will not run unchanged; the PHP API is quite different. But if your code already uses method names similar to the W3C standard, porting is not very difficult. You mainly need to change the load and save functions and remove the underscores in function names (the DOM standard uses camelCase, capitalizing the first letter of each inner word). Adjustments elsewhere will of course be necessary, but the main logic can remain unchanged.
Reading DOM
I won't explain every feature of the DOM extension in this article; that is not necessary. You may want to bookmark the documentation at HTTP://www.w3.org/DOM... Most of the examples in this article use the same XML file, a very simple version of the RSS feed on zend.com. Paste the text below into a text file and save it as articles.xml.
http://www.zend.com/zend/week/week172.php
http://www.zend.com/zend/tut/tut-hatwar3.php
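Only the two link values of the listing are visible here. As a stand-in, the following sketch writes a file with the structure that the later examples rely on (an articles root, item children, each with a title and a link). The element layout is inferred from the XPath expressions used in this article, and the two titles are placeholders, not the original text.

```php
<?php
// Hypothetical reconstruction of articles.xml. Element names follow the
// XPath expressions used in this article; the titles are placeholders.
$xml = <<<'XML'
<?xml version="1.0"?>
<articles>
  <item>
    <title>First article (placeholder title)</title>
    <link>http://www.zend.com/zend/week/week172.php</link>
  </item>
  <item>
    <title>Second article (placeholder title)</title>
    <link>http://www.zend.com/zend/tut/tut-hatwar3.php</link>
  </item>
</articles>
XML;

// Written to the temp directory here; change the path to suit your setup.
$path = sys_get_temp_dir() . '/articles.xml';
file_put_contents($path, $xml);
print "Wrote $path\n";
```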
To load this example into a DOM object, first create a DOMDocument object and then load the XML file.
$dom = new DomDocument();
$dom->load("articles.xml");
As mentioned above, you can use PHP streams to load an XML document. You would write:
$dom->load("file:///articles.xml");
(or other type of data stream)
If you want to output the XML document to the browser or to standard output, use:
print $dom->saveXML();
If you want to save it as a file, please use:
print $dom->save("newfile.xml");
(Note that this prints the file size to stdout, because save() returns the number of bytes written.)
Of course this example doesn't have much functionality, let's do something more useful. Let's get all the title elements. There are many ways to do it, the simplest is to use getElementsByTagName($tagname):
$titles = $dom->getElementsByTagName("title");
foreach ($titles as $node) {
    print $node->textContent . "\n";
}
The textContent property is a convenience for quickly reading all the text inside an element. It was not part of the DOM Level 2 standard (it was later standardized in DOM Level 3 Core). The older standard way looks like this:
$node->firstChild->data;
(In this case you have to make sure that firstChild really is the text node you need; otherwise you have to traverse all the child nodes to find it.)
Another thing to note is that getElementsByTagName() returns a DOMNodeList object rather than an array like get_elements_by_tagname() in PHP4. As you can see in this example, you can easily traverse it with a foreach statement. You can also access a node directly with $titles->item(0), which returns the first title element.
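The behaviour of the node list can be checked with a small self-contained sketch. The inline document and its titles are assumptions made for this illustration; it does not need articles.xml on disk.

```php
<?php
// Assumed sample document; the titles are placeholders.
$dom = new DOMDocument();
$dom->loadXML('<articles>
  <item><title>First title</title></item>
  <item><title>Second title</title></item>
</articles>');

$titles = $dom->getElementsByTagName('title');

print get_class($titles) . "\n";          // DOMNodeList
print $titles->length . "\n";             // 2
$first = $titles->item(0);
print $first->textContent . "\n";         // First title
print $first->firstChild->data . "\n";    // First title (via the text node)
var_dump($titles->item(5));               // NULL: index out of range
```

Because item() returns NULL for an index that does not exist, it is worth checking length before dereferencing the result.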
Another way to get all title elements is to traverse from the root node. As you can see, this method is more complicated, but it is more flexible if you need more than just title elements.
foreach ($dom->documentElement->childNodes as $articles) {
    // If the node is an element (nodeType == 1) and its name is "item", loop over its children
    if ($articles->nodeType == 1 && $articles->nodeName == "item") {
        foreach ($articles->childNodes as $item) {
            // If the node is an element and its name is "title", print it
            if ($item->nodeType == 1 && $item->nodeName == "title") {
                print $item->textContent . "\n";
            }
        }
    }
}
XPath
XPath is like SQL for XML. With XPath you can query an XML document for nodes that match a given pattern. To get all title nodes with XPath, just do this:
$xp = new DOMXPath($dom);
$titles = $xp->query("/articles/item/title");
foreach ($titles as $node) {
    print $node->textContent . "\n";
}
This is similar to using getElementsByTagName(), but XPath is much more powerful. For example, if there were a title element that is a direct child of articles (rather than a child of item), getElementsByTagName() would return it as well. With the /articles/item/title expression, we only get title elements at the specified depth and position. This is just a simple example; more advanced expressions might look like this:
/articles/item[position() = 1]/title Returns all titles of the first item element
/articles/item/title[@id = '23'] Returns all titles that have an id attribute with the value 23
/articles//title Returns all titles under the articles element (Translator's note: // matches any depth)
You can also query for elements that have particular sibling elements, elements with particular text content, or elements in namespaces, and so on. If you have to query a large number of XML documents, learning to use XPath properly will save you a lot of time. It is simple to use, fast to execute, and requires less code than the standard DOM.
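The difference between these expressions is easiest to see by counting matches. The sketch below uses an assumed inline document that has a title directly under articles plus two items, one of them with id="23"; none of this data comes from the original articles.xml.

```php
<?php
// Assumed sample document for illustration only.
$dom = new DOMDocument();
$dom->loadXML('<articles>
  <title>Feed title</title>
  <item><title id="23">First title</title></item>
  <item><title>Second title</title></item>
</articles>');
$xp = new DOMXPath($dom);

$queries = [
    '/articles/item/title',
    '/articles/item[position() = 1]/title',
    "/articles/item/title[@id = '23']",
    '/articles//title',
];
$counts = [];
foreach ($queries as $q) {
    $counts[$q] = $xp->query($q)->length;   // number of matching nodes
    print $q . ' => ' . $counts[$q] . "\n";
}
// getElementsByTagName() ignores depth, so it also finds the feed title.
print 'getElementsByTagName => ' . $dom->getElementsByTagName('title')->length . "\n";
```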
Writing data to the DOM
The Document Object Model is not only about reading and querying; you can also manipulate and write. (The DOM standard is a bit verbose because its authors wanted to support every environment imaginable, but it works very well.) Take a look at the following example, which adds a new element to our articles.xml file.
$item = $dom->createElement("item");
$title = $dom->createElement("title");
$titletext = $dom->createTextNode("XML in PHP5");
$title->appendChild($titletext);
$item->appendChild($title);
$dom->documentElement->appendChild($item);
print $dom->saveXML();
First we create all the required nodes: an item element, a title element and a text node containing the item's title. Then we link the nodes together: we append the text node to the title element, append the title element to the item element, and finally insert the item element into the articles root element. We now have a new article in the list in our XML document.
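A self-contained variant of these steps is sketched below. The inline document is an assumption, and the preserveWhiteSpace and formatOutput properties are added here only to make the printed result easier to read. It also shows that text added through createTextNode() is escaped on output.

```php
<?php
$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;   // set before loading so formatOutput can re-indent
$dom->formatOutput = true;
$dom->loadXML('<articles><item><title>First title</title></item></articles>');

$item  = $dom->createElement('item');
$title = $dom->createElement('title');
// Markup characters in a text node are escaped when the document is serialized.
$title->appendChild($dom->createTextNode('XML & PHP5'));
$item->appendChild($title);
$dom->documentElement->appendChild($item);

$xml = $dom->saveXML();
print $xml;
```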
Extending the classes
Granted, the examples above could also be done with the DOMXML extension in PHP4 (only the API is slightly different). Being able to extend the DOM classes yourself is a new feature of PHP5, and it makes more readable code possible. The following is the whole example rewritten as a class that extends DOMDocument:
class Articles extends DomDocument {
    function __construct() {
        // Must be called!
        parent::__construct();
    }
    function addArticle($title) {
        $item = $this->createElement("item");
        $titlespace = $this->createElement("title");
        $titletext = $this->createTextNode($title);
        $titlespace->appendChild($titletext);
        $item->appendChild($titlespace);
        $this->documentElement->appendChild($item);
    }
}
$dom = new Articles();
$dom->load("articles.xml");
$dom->addArticle("XML in PHP5");
print $dom->save("newfile.xml");
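A subclass can carry read helpers as well as write helpers. In the sketch below, getTitles() and the chaining return value are hypothetical additions for illustration, and the document is loaded from an inline string so the example runs without articles.xml.

```php
<?php
// addArticle() follows the example above; getTitles() is a hypothetical addition.
class Articles extends DOMDocument
{
    public function addArticle($title)
    {
        $item = $this->createElement('item');
        $titleNode = $this->createElement('title');
        $titleNode->appendChild($this->createTextNode($title));
        $item->appendChild($titleNode);
        $this->documentElement->appendChild($item);
        return $this;   // allow chained calls
    }

    public function getTitles()
    {
        $titles = [];
        foreach ($this->getElementsByTagName('title') as $node) {
            $titles[] = $node->textContent;
        }
        return $titles;
    }
}

$dom = new Articles();
$dom->loadXML('<articles><item><title>First title</title></item></articles>');
$dom->addArticle('XML in PHP5')->addArticle('Second addition');
print implode("\n", $dom->getTitles()) . "\n";
```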
HTML
An often overlooked feature of PHP5 is libxml2's HTML support. With the DOM extension you can load not only well-formed XML documents but also non-well-formed HTML documents, treat them as standard DOMDocument objects, and use all the available methods and features, such as XPath and SimpleXML.
The HTML support is particularly useful when you need to access content on a site that you do not control. With the help of XPath, XSLT or SimpleXML you save a lot of code compared to matching strings with regular expressions or using a SAX parser. This is especially useful when the HTML document is not well structured (a frequent problem!).
The following code obtains and parses the homepage of php.net and returns the content of the first title element.
$dom = new DomDocument();
$dom->loadHTMLFile("http://www.php.net/");
$title = $dom->getElementsByTagName("title");
print $title->item(0)->textContent;
Please note that you may get error output when the specified element is not found. If your website still outputs HTML4 from PHP, there is good news: the DOM extension can not only load HTML documents but also save them in HTML4 format. After you have modified the DOM document, use $dom->saveHTML() to save it. Note that if you only want to make your HTML output conform to W3C standards, you are better off using the tidy extension: the HTML support in libxml2 does not take every eventuality into account and does not handle input in uncommon formats well.
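The following sketch parses a sloppy HTML string without touching the network. It assumes a PHP version that provides libxml_use_internal_errors() (added after PHP 5.0) to collect parser messages instead of printing warnings, and it guards against a missing element before reading it. The HTML string is made up for the example.

```php
<?php
$html = '<html><head><title>Example page</title></head>'
      . '<body><p>Unclosed paragraph<p>Another one<br></body></html>';

// Collect parser complaints instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$problems = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$titles = $dom->getElementsByTagName('title');
// Check the list length first: item(0) is NULL when nothing matched.
$title = $titles->length > 0 ? $titles->item(0)->textContent : '(no title)';
print $title . "\n";
print count($problems) . " parser message(s)\n";
print $dom->saveHTML();
```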
Validation
Validation of XML documents is becoming more and more important. For example, if you obtain an XML document from an external source, you need to check that it conforms to the expected format before you process it. Fortunately you don't need to write your own validator in PHP, since you can use one of the three most widely used standards (DTD, XML Schema or RelaxNG).
DTD is a standard from the SGML era. It lacks support for some newer XML features (such as namespaces), and because it is not written in XML it is also difficult to parse and transform.
XML Schema is a standard developed by the W3C. It is widely used and contains almost everything needed to validate XML documents.
RelaxNG was created by an independent group as an answer to the complex XML Schema standard. Since it is easier to implement than XML Schema, more and more programs are beginning to support RelaxNG.
If you don't have legacy schema documents or very complex XML documents, use RelaxNG. It is relatively simple to write and read, and more and more tools support it. There is even a tool called Trang that can automatically create a RelaxNG schema from an XML sample document. Also, only RelaxNG (and the aging DTDs) is fully supported by libxml2, although full XML Schema support in libxml2 is on its way.
The syntax for validating XML documents is quite simple:
$dom->validate(); // DTD: uses the DOCTYPE declared in the document
$dom->relaxNGValidate('articles.rng');
$dom->schemaValidate('articles.xsd');
Currently, all of these simply return true or false, and errors are output as PHP warnings. Obviously this is not a good way to give users friendly feedback, and it will be improved in a release after PHP 5.0. How exactly this will be implemented is still under discussion, but error reporting will definitely be handled better.
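In later PHP releases the libxml error functions make it possible to collect validation messages rather than receive warnings. The sketch below assumes those functions are available and uses relaxNGValidateSource() with an inline schema. The schema and the validateArticles() helper are hypothetical and written only for this illustration.

```php
<?php
// Hypothetical RelaxNG schema for an articles document.
$rng = <<<'RNG'
<element name="articles" xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>
    <element name="item">
      <element name="title"><text/></element>
      <element name="link"><text/></element>
    </element>
  </oneOrMore>
</element>
RNG;

// Returns [bool valid, string[] messages] without emitting PHP warnings.
function validateArticles($xml, $rng)
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml) && $dom->relaxNGValidateSource($rng);
    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = trim($error->message);
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return [$ok, $messages];
}

list($okGood, $msgGood) = validateArticles(
    '<articles><item><title>T</title><link>http://example.com/</link></item></articles>', $rng);
list($okBad, $msgBad) = validateArticles(
    '<articles><item><link>http://example.com/</link></item></articles>', $rng);

var_dump($okGood);   // valid document
var_dump($okBad);    // the item lacks a title
print implode("\n", $msgBad) . "\n";
```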
SimpleXML
SimpleXML is the latest member of PHP's XML family. The purpose of the SimpleXML extension is to provide a simpler way to access XML documents using standard object properties and iterators. The extension doesn't have many methods, but it is still quite powerful. Retrieving all title nodes from our document requires less code than before.
$sxe = simplexml_load_file("articles.xml");
foreach ($sxe->item as $item) {
    print $item->title . "\n";
}
What is happening here? First, articles.xml is loaded into a SimpleXML object. Then we iterate over all the item elements in $sxe, and finally $item->title returns the content of the title element. That's it. You can also read attributes using array syntax: $item->title['id'].
As you can see, there are several ways to get the result we want. For example, $item->title[0] returns the same result as in the example. On the other hand, foreach($sxe->item->title as $item) only loops over the titles of the first item, not over all title elements in the document (as one might expect coming from XPath).
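These access patterns can be tried on an assumed inline document (the titles and the id value are placeholders). The sketch also shows the explicit casts that are usually needed, because property and attribute access returns SimpleXMLElement objects rather than plain strings.

```php
<?php
$sxe = simplexml_load_string('<articles>
  <item><title id="23">First title</title></item>
  <item><title>Second title</title></item>
</articles>');

print count($sxe->item) . "\n";              // 2
print $sxe->item[1]->title . "\n";           // Second title
print $sxe->item[0]->title['id'] . "\n";     // 23

// Cast before comparing or storing values.
$id = (int) $sxe->item[0]->title['id'];

$firstItemTitles = [];
foreach ($sxe->item->title as $t) {          // titles of the FIRST item only
    $firstItemTitles[] = (string) $t;
}
print implode(', ', $firstItemTitles) . "\n";
```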
SimpleXML is actually the first extension to use the new features of Zend Engine 2, so it has become a test bed for them. Be aware that bugs and unpredictable errors are not uncommon during this development stage.
In addition to the method of traversing all nodes used in the above example, there is also an XPath interface in SimpleXML, which provides a simpler way to access a single node.
foreach ($sxe->xpath('/articles/item/title') as $item) {
    print $item . "\n";
}
Admittedly, this code is not shorter than the previous example, but with more complex or deeply nested XML documents you will find that using XPath with SimpleXML saves you a lot of typing.
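A case where XPath does save typing is selecting by a condition on a sibling. The sketch below, on an assumed inline document, fetches the link of the item whose title carries id="23".

```php
<?php
$sxe = simplexml_load_string('<articles>
  <item><title id="23">First title</title><link>http://example.com/1</link></item>
  <item><title>Second title</title><link>http://example.com/2</link></item>
</articles>');

// xpath() returns a plain PHP array of SimpleXMLElement objects.
$links = $sxe->xpath("/articles/item[title/@id = '23']/link");
foreach ($links as $link) {
    print $link . "\n";
}
```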
Writing data to SimpleXML documents
You can not only parse and read with SimpleXML but also change SimpleXML documents, at least to some extent:
$sxe->item->title = "XML in PHP5"; //New content of the title element.
$sxe->item->title['id'] = 34; // New attributes of the title element.
$xmlString = $sxe->asXML(); // Return the SimpleXML object as a serialized XML string
print $xmlString;
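Note that $sxe->item->title without an index refers to the first item. The following sketch, on an assumed inline document, changes the second item explicitly and also uses addChild(), which is only available in later PHP 5 releases (5.1.3 and up), to append a new item.

```php
<?php
$sxe = simplexml_load_string(
    '<articles><item><title>First title</title></item>'
    . '<item><title>Second title</title></item></articles>');

$sxe->item[1]->title = 'XML in PHP5';   // change the second item's title
$sxe->item[1]->title['id'] = 34;        // add an attribute to it

// addChild() creates and appends a new element in one step.
$new = $sxe->addChild('item');
$new->addChild('title', 'Added with addChild()');

$xml = $sxe->asXML();
print $xml;
```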
Interoperability
Because SimpleXML is also based on the libxml2 library, you can easily convert SimpleXML objects into DomDocument objects and back with little impact on speed (the document does not need to be copied internally). Thanks to this mechanism you get the best of both worlds and can use whichever tool suits the job at hand. It is used like this:
$sxe = simplexml_import_dom($dom);
$dom = dom_import_simplexml($sxe);
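The shared tree can be observed directly. In the sketch below (inline document assumed), a change made through SimpleXML is visible through the DOM object. Note that in current PHP versions dom_import_simplexml() returns a DOMElement; the document itself is reachable through its ownerDocument property.

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<articles><item><title>First title</title></item></articles>');

$sxe = simplexml_import_dom($dom);             // DOM -> SimpleXML
$sxe->item->title = 'Changed via SimpleXML';   // edit through SimpleXML

// Both objects wrap the same underlying tree, so the DOM sees the change.
$seenByDom = $dom->getElementsByTagName('title')->item(0)->textContent;
print $seenByDom . "\n";

$element = dom_import_simplexml($sxe);         // SimpleXML -> DOMElement
print get_class($element) . "\n";
var_dump($element->ownerDocument->isSameNode($dom));
```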
XSLT
XSLT is a language for transforming XML documents into other XML documents. XSLT itself is written in XML and belongs to the family of functional languages, so its processing model differs from that of object-oriented languages (like PHP). PHP4 had two XSLT processors: Sablotron (in the widely used XSLT extension) and libxslt (in the domxml extension). The two APIs were not compatible with each other, and they were used differently. PHP5 only supports the libxslt processor, which was chosen because it is based on libxml2 and therefore fits PHP5's XML concept better.
In theory it is possible to bind Sablotron to PHP5, but unfortunately no one has done it. So if you are using Sablotron, you have to switch to the libxslt processor in PHP5. Libxslt is on par with Sablotron feature-wise, with the exception of JavaScript support, and you can even use PHP's powerful streams to re-implement Sablotron's unique scheme handlers. In addition, libxslt is one of the fastest XSLT processors, so you get a speed boost for free (it executes about twice as fast as Sablotron).
As with the other extensions discussed in this article, you can pass XML documents from the XSL extension to the DOM extension and vice versa. In fact you have to, because the ext/xsl extension has no interface for loading and saving XML documents; it relies on the DOM extension for that. There is not much to learn to get started with XSLT transformations. The API follows no W3C standard; it is "borrowed" from Mozilla.
First you need an XSLT stylesheet. Paste the following text into a new file and save it as articles.xsl.
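The stylesheet listing itself is not shown here. As a stand-in, this sketch writes a small stylesheet that would work with the script that follows. Only the parameter name titles is taken from that script; the templates and the HTML-like output structure are assumptions.

```php
<?php
// Hypothetical stand-in for articles.xsl. Only the "titles" parameter name
// comes from the PHP script in this article; the rest is illustrative.
$xsl = <<<'XSL'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="titles"/>
  <xsl:template match="/articles">
    <div>
      <h2><xsl:value-of select="$titles"/></h2>
      <ul>
        <xsl:for-each select="item">
          <li><xsl:value-of select="title"/></li>
        </xsl:for-each>
      </ul>
    </div>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Written to the temp directory here; change the path to suit your setup.
$xslPath = sys_get_temp_dir() . '/articles.xsl';
file_put_contents($xslPath, $xsl);
print "Wrote $xslPath\n";
```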
Then call it from a PHP script:
/* Load the XML and XSL documents into DOMDocument objects */
$xsl = new DomDocument();
$xsl->load("articles.xsl");
$inputdom = new DomDocument();
$inputdom->load("articles.xml");
/* Create an XSLT processor and import the stylesheet */
$proc = new XsltProcessor();
$proc->importStylesheet($xsl);
$proc->setParameter("", "titles", "Titles");
/* Transform and output the XML document */
$newdom = $proc->transformToDoc($inputdom);
print $newdom->saveXML();
The example above first uses the DOM method load() to load the XSLT stylesheet articles.xsl, then creates a new XsltProcessor object and imports the stylesheet into it for later use. Parameters can be set with setParameter(namespaceURI, name, value). Finally, the XsltProcessor object starts the transformation with transformToDoc($inputdom) and returns a new DOMDocument object.
The advantage of this API is that you can use the same stylesheet to transform many XML documents: load it once and reuse it, because transformToDoc() can be applied to different XML documents.
Besides transformToDoc() there are two more transformation methods: transformToXML($dom) returns a string, and transformToURI($dom, $uri) saves the transformed document to a file or a PHP stream. Note that if you want to use XSLT output options such as <xsl:output method="html"> or indent="yes", you cannot use transformToDoc(), because the DOMDocument object cannot keep this information. These options only take effect when you write the transformed result directly to a string or a file.
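Both points, reusing one imported stylesheet and getting a string result that honours xsl:output, can be sketched as follows. The stylesheet and the two input documents are assumptions written for this example, and the XSL extension must be enabled.

```php
<?php
// Hypothetical stylesheet that produces plain text via xsl:output.
$xsl = new DOMDocument();
$xsl->loadXML(<<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:param name="titles"/>
  <xsl:template match="/articles">
    <xsl:value-of select="$titles"/>
    <xsl:text>: </xsl:text>
    <xsl:for-each select="item">
      <xsl:value-of select="title"/>
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL
);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);                 // import once
$proc->setParameter('', 'titles', 'Titles');

$sources = [
    '<articles><item><title>A</title></item><item><title>B</title></item></articles>',
    '<articles><item><title>C</title></item></articles>',
];
$results = [];
foreach ($sources as $source) {                // reuse for every input document
    $input = new DOMDocument();
    $input->loadXML($source);
    $results[] = $proc->transformToXml($input);   // string result, text output method
}
print implode("\n", $results) . "\n";
```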
Calling PHP functions
The latest addition to the XSLT extension is the ability to call any PHP function from within an XSLT stylesheet. XML purists will definitely not like this feature (such stylesheets get a bit complicated and make it very easy to mix up logic and design), but it is very useful in some places. XSLT is very limited when it comes to functions; even outputting a date in different languages is cumbersome. With this feature, it becomes as easy as it is in plain PHP. Here is the code that makes a function available to XSLT:
function dateLang() {
    // Note: strftime() is deprecated as of PHP 8.1
    return strftime("%A");
}
$xsl = new DomDocument();
$xsl->load("datetime.xsl");
$inputdom = new DomDocument();
$inputdom->load("today.xml");
$proc = new XsltProcessor();
$proc->registerPhpFunctions();
// Import the stylesheet in $xsl into the processor
$proc->importStylesheet($xsl);
/* Transform and output the XML document */
$newdom = $proc->transformToDoc($inputdom);
print $newdom->saveXML();
The following is the XSLT stylesheet datetime.xsl, which calls this function.
The following is the XML document to be transformed with the stylesheet, today.xml (articles.xml would produce the same result).
Together, the stylesheet, the PHP script and the XML file output the name of the current weekday in the language of the current system locale. You can pass more arguments to php:function(); they are handed on to the PHP function. There is also php:functionString(), which automatically converts all arguments to strings so that you don't need to convert them in PHP.
Note that you need to call $proc->registerPhpFunctions() before the transformation; otherwise the PHP function calls will not be executed, for security reasons (do you always trust your XSLT stylesheets?). An access control system has not been implemented yet; it may be added in a future version of PHP5.
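A self-contained sketch of the mechanism follows. The weekdayName() helper and the stylesheet are hypothetical; the php namespace URI http://php.net/xsl is the one used by PHP's XSL extension. The helper uses date('l'), which always yields the English weekday name, instead of the locale-aware call above. Later PHP releases accept a list of allowed function names in registerPHPFunctions(), which this example uses.

```php
<?php
// Hypothetical helper called from the stylesheet; date('l') is the English weekday name.
function weekdayName()
{
    return date('l');
}

$xsl = new DOMDocument();
$xsl->loadXML(<<<'XSL'
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:php="http://php.net/xsl"
    exclude-result-prefixes="php">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="php:function('weekdayName')"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="php:functionString('strtoupper', /articles/item[1]/title)"/>
  </xsl:template>
</xsl:stylesheet>
XSL
);

$input = new DOMDocument();
$input->loadXML('<articles><item><title>XML in PHP5</title></item></articles>');

$proc = new XSLTProcessor();
// Only the listed functions may be called from the stylesheet.
$proc->registerPHPFunctions(['weekdayName', 'strtoupper']);
$proc->importStylesheet($xsl);
$result = $proc->transformToXml($input);
print $result . "\n";
```

Here php:functionString() hands the first title to strtoupper() as a plain string, whereas php:function() would pass the matched nodes themselves.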
Summary
PHP's XML support has taken a big step forward. It is standard-compliant, powerful, interoperable, enabled by default, and ready for production use. The new SimpleXML extension provides a simple and fast way to access XML documents, which can save you a lot of code, especially when your documents are well structured or you can use the power of XPath.
Thanks to libxml2, the underlying library used by the PHP5 XML extension, validating XML documents using DTD, RelaxNG or XML Schema is now supported.
XSL support has also been revamped, now using the Libxslt library, which has greatly improved performance over the original Sablotron library. Moreover, calling PHP functions inside the XSLT style sheet allows you to write more powerful XSLT code.
If you have used XML in PHP4 or in other languages, you will like the XML features of PHP5. XML support has changed a lot in PHP5; it is standard-compliant and on par with other tools and languages.
Links
PHP 4 related
Domxml extension: http://www.php.net/domxml/
Sablotron extension: http://www.php.net/xslt/
Libxslt : http://www.php.net/manual/en/functi...-stylesheet.php
PHP 5 related
SimpleXML: http://www.php.net/simplexml/
Streams : http://www.php.net/manual/en/ref.stream.php
Standards
DOM: http://www.w3.org/DOM
XSLT: http://www.w3.org/TR/xslt
XPath: http://www.w3.org/TR/xpath
XML Schema: http://www.w3.org/XML/Schema
RelaxNG: http://relaxng.org/
Xinclude: http://www.w3.org/TR/xinclude/
Tools
Libxml2, the underlying library: http://xmlsoft.org/
Trang, a Schema/RelaxNG/etc converter: http://www.thaiopensource.com/relaxng/trang.html
About the author
Christian Stocker is the founder and CEO of Bitflux GmbH in Zurich. He is a maintainer of the XSL, DOM and imagick extensions, co-author of the German book PHP de Luxe, and also works on other open source projects such as Bitflux Editor and Popoon. You can contact him at chregu@php.net.