Reading and writing XML and DOM with PHP
Reading and writing Extensible Markup Language (XML) in PHP may seem a little scary. In fact, XML and all its related technologies can be scary, but reading and writing XML in PHP doesn't have to be. First, you need to learn a little about XML: what it is and what it is used for. Then you need to learn how to read and write XML in PHP, and there are several ways to do this.
What is XML?
XML is a data storage format. It defines neither what data is stored nor how that data is structured; it defines only tags and the attributes of those tags. Well-formed XML markup looks like this:
<ol class="dp-xml"><li class="alt"><span><span class="tag"><</span><span class="tag-name">name</span><span class="tag">></span><span>This test for php100</span><span class="tag"></</span><span class="tag-name">name</span><span class="tag">></span><span> </span></span></li></ol>
The name tag contains some text. Well-formed XML markup without text looks like this:
<ol class="dp-xml"><li class="alt"><span><span class="tag"><</span><span class="tag-name">powerUp</span><span> </span><span class="tag">/></span><span> </span></span></li></ol>
XML often allows more than one way to write the same thing. For example, this tag is equivalent to the previous one:
<ol class="dp-xml"><li class="alt"><span><span class="tag"><</span><span class="tag-name">powerUp</span><span class="tag">></span><span class="tag"></</span><span class="tag-name">powerUp</span><span class="tag">></span><span> </span></span></li></ol>
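The equivalence of the two spellings can be checked from PHP. The following is a minimal sketch, assuming the DOM extension is enabled; it parses both forms and compares their canonical XML, and the variable names are only illustrative.

```php
<?php
// Parse both spellings of the empty powerUp tag shown above.
$selfClosing = new DOMDocument();
$selfClosing->loadXML('<powerUp />');

$openClose = new DOMDocument();
$openClose->loadXML('<powerUp></powerUp>');

// Neither element has any child nodes once parsed.
$bothEmpty = !$selfClosing->documentElement->hasChildNodes()
    && !$openClose->documentElement->hasChildNodes();

// C14N() returns the canonical form of a document, so two documents
// that differ only in how an empty tag is spelled compare as equal.
$same = $selfClosing->C14N() === $openClose->C14N();

var_dump($bothEmpty, $same);
```

Comparing canonical forms rather than raw strings is a design choice here: the raw strings differ, but an XML parser treats them as the same element.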
Attributes can also be added to XML tags. For example, this name tag contains first and last attributes:
<ol class="dp-xml"><li class="alt"><span><span class="tag"><</span><span class="tag-name">name</span><span> </span><span class="attribute">first</span><span>=</span><span class="attribute-value">"Jack"</span><span> </span><span class="attribute">last</span><span>=</span><span class="attribute-value">"Herrington"</span><span> </span><span class="tag">/></span><span> </span></span></li></ol>
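As a rough illustration of how such attributes might be read in PHP, the sketch below loads the name tag into a DOMDocument and fetches each attribute by name. It assumes the DOM extension is available; the variable names are hypothetical.

```php
<?php
// Load the single-tag example into a DOM document.
$doc = new DOMDocument();
$doc->loadXML('<name first="Jack" last="Herrington" />');

// documentElement is the root element, here the name tag.
$name = $doc->documentElement;

// getAttribute() returns the attribute value as a string,
// or an empty string when the attribute is not present.
$first = $name->getAttribute('first');
$last = $name->getAttribute('last');

echo $first . ' ' . $last . "\n";
```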
Special characters can also be encoded in XML. For example, the & symbol can be encoded like this:
<ol class="dp-xml"><li class="alt"><span><span>&amp; </span></span></li></ol>
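In PHP this encoding rarely has to be done by hand. The sketch below shows two possible approaches under stated assumptions: escaping a string with htmlspecialchars() and the ENT_XML1 flag, or letting the DOM extension escape a text node when it serializes. The sample text "Hacks & Tips" and the title tag are made up for illustration.

```php
<?php
// Hypothetical text containing a character that must be encoded in XML.
$text = 'Hacks & Tips';

// Option 1: escape the string manually before placing it in markup.
$escaped = htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');

// Option 2: build the element with DOM and let it escape on output.
$doc = new DOMDocument();
$title = $doc->createElement('title');
$title->appendChild($doc->createTextNode($text));
$doc->appendChild($title);

// Passing a node to saveXML() serializes only that node.
$markup = $doc->saveXML($title);

echo $escaped . "\n";
echo $markup . "\n";
```

Building the text through createTextNode() keeps the escaping in one place, which tends to be less error-prone than concatenating strings.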
An XML file containing tags and attributes is well-formed if it is formatted like the examples above, meaning that every opening tag has a matching closing tag and special characters are encoded correctly. Listing 1 is an example of well-formed XML.
Listing 1. XML book list example
<ol class="dp-xml"> <li class="alt"><span><span class="tag"><</span><span class="tag-name">books</span><span class="tag">></span><span> </span></span></li> <li><span><span class="tag"><</span><span class="tag-name">book</span><span class="tag">></span><span> </span></span></li> <li class="alt"><span><span class="tag"><</span><span class="tag-name">author</span><span class="tag">></span><span>Jack Herrington</span><span class="tag"></</span><span class="tag-name">author</span><span class="tag">></span><span> </span></span></li> <li><span><span class="tag"><</span><span class="tag-name">title</span><span class="tag">></span><span>PHP Hacks</span><span class="tag"></</span><span class="tag-name">title</span><span class="tag">></span><span> </span></span></li> <li class="alt"><span><span class="tag"><</span><span class="tag-name">publisher</span><span class="tag">></span><span>OReilly</span><span class="tag"></</span><span class="tag-name">publisher</span><span class="tag">></span><span> </span></span></li> <li><span><span class="tag"></</span><span class="tag-name">book</span><span class="tag">></span><span> </span></span></li> <li class="alt"><span><span class="tag"><</span><span class="tag-name">book</span><span class="tag">></span><span> </span></span></li> <li><span><span class="tag"><</span><span class="tag-name">author</span><span class="tag">></span><span>Jack Herrington</span><span class="tag"></</span><span class="tag-name">author</span><span class="tag">></span><span> </span></span></li> <li class="alt"><span><span class="tag"><</span><span class="tag-name">title</span><span class="tag">></span><span>Podcasting Hacks</span><span class="tag"></</span><span class="tag-name">title</span><span class="tag">></span><span> </span></span></li> <li><span><span class="tag"><</span><span class="tag-name">publisher</span><span class="tag">></span><span>OReilly</span><span class="tag"></</span><span class="tag-name">publisher</span><span class="tag">></span><span> 
</span></span></li> <li class="alt"><span><span class="tag"></</span><span class="tag-name">book</span><span class="tag">></span><span> </span></span></li> <li><span><span class="tag"></</span><span class="tag-name">books</span><span class="tag">></span><span> </span></span></li> </ol>
The XML in Listing 1 contains a list of books. The parent books tag contains a set of book tags, each of which contains author, title, and publisher tags.
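To make the structure of Listing 1 concrete, here is a minimal sketch of one way to walk it with PHP's DOM extension. The XML is embedded as a string so the example is self-contained; the helper name childText and the $rows array are assumptions for illustration, not part of the original article.

```php
<?php
// The book list from Listing 1, embedded as a nowdoc string.
$xml = <<<'XML'
<books>
  <book>
    <author>Jack Herrington</author>
    <title>PHP Hacks</title>
    <publisher>OReilly</publisher>
  </book>
  <book>
    <author>Jack Herrington</author>
    <title>Podcasting Hacks</title>
    <publisher>OReilly</publisher>
  </book>
</books>
XML;

// Hypothetical helper: text of the first child tag with the given name.
function childText(DOMElement $parent, string $tag): string
{
    return $parent->getElementsByTagName($tag)->item(0)->nodeValue;
}

$doc = new DOMDocument();
$doc->loadXML($xml);

// Collect one row per book tag found under the parent books tag.
$rows = [];
foreach ($doc->getElementsByTagName('book') as $book) {
    $rows[] = [
        'author' => childText($book, 'author'),
        'title' => childText($book, 'title'),
        'publisher' => childText($book, 'publisher'),
    ];
}

foreach ($rows as $row) {
    echo $row['title'] . ' - ' . $row['author'] . ' - ' . $row['publisher'] . "\n";
}
```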
If you think XML looks a lot like Hypertext Markup Language (HTML), you're right. XML and HTML are both markup-based languages and they have many similarities. However, it is important to note that while an XML document may also be valid HTML, not all HTML documents are well-formed XML. The line break tag (br) is a good example of the difference between XML and HTML. This line break tag is valid HTML, but not well-formed XML:
<ol class="dp-xml"> <li class="alt"><span><span class="tag"><</span><span class="tag-name">p</span><span class="tag">></span><span>This is a paragraph</span><span class="tag"><</span><span class="tag-name">br</span><span class="tag">></span><span> </span></span></li> <li> <span>With a line break</span><span class="tag"></</span><span class="tag-name">p</span><span class="tag">></span><span> </span> </li> </ol>
This line break tag is well-formed XML and is also accepted as HTML:
<ol class="dp-xml"> <li class="alt"><span><span class="tag"><</span><span class="tag-name">p</span><span class="tag">></span><span>This is a paragraph</span><span class="tag"><</span><span class="tag-name">br</span><span> </span><span class="tag">/></span><span> </span></span></li> <li> <span>With a line break</span><span class="tag"></</span><span class="tag-name">p</span><span class="tag">></span><span> </span> </li> </ol>
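The difference between the two br spellings can be observed with an XML parser. The sketch below is one possible check, assuming PHP's DOM and libxml extensions; the function name isWellFormedXml is hypothetical. It relies on loadXML() returning false when the markup is not well-formed.

```php
<?php
// Hypothetical helper: true when the string parses as well-formed XML.
function isWellFormedXml(string $xml): bool
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);

    // Discard collected errors and restore the earlier error setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok === true;
}

// The unclosed br tag is fine for an HTML parser but not for an XML parser.
$htmlStyle = '<p>This is a paragraph<br>With a line break</p>';
// The self-closing br tag keeps every tag balanced.
$xmlStyle = '<p>This is a paragraph<br />With a line break</p>';

var_dump(isWellFormedXml($htmlStyle));
var_dump(isWellFormedXml($xmlStyle));
```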
To write HTML as well-formed XML, follow the World Wide Web Consortium (W3C) Extensible Hypertext Markup Language (XHTML) standard (see Resources). All modern browsers can render XHTML. Furthermore, you can use XML tools to read XHTML and find the data in the document, which is much easier than parsing HTML.
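As a small illustration of reading XHTML with XML tools, the sketch below loads an XHTML fragment with the same DOM calls used for any other XML and collects its links. The page content and the example.com URLs are invented for this example, and the DOM extension is assumed to be enabled.

```php
<?php
// A made-up XHTML page; every tag is closed, so an XML parser accepts it.
$xhtml = <<<'XHTML'
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Book links</title></head>
  <body>
    <p>Two books<br />by the same author:</p>
    <ul>
      <li><a href="http://example.com/php-hacks">PHP Hacks</a></li>
      <li><a href="http://example.com/podcasting-hacks">Podcasting Hacks</a></li>
    </ul>
  </body>
</html>
XHTML;

$doc = new DOMDocument();
$doc->loadXML($xhtml);

// Map each link's text to its href attribute.
$links = [];
foreach ($doc->getElementsByTagName('a') as $anchor) {
    $links[$anchor->nodeValue] = $anchor->getAttribute('href');
}

print_r($links);
```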