Comparison of XmlTextReader and XmlDocument for reading XML files
I came across an article on the Internet and tried it myself. Sure enough, XmlTextReader is faster!
The XmlTextReader class in the System.Xml namespace of the .NET Framework reads data from XML files quickly without placing heavy demands on system resources. This article uses the XmlTextReader class to read data from an XML file and convert it to HTML for display in the browser.
Before reading this article, readers should have some basic knowledge of XML, HTML, the C# programming language, and .NET, especially the ASP.NET framework.
Microsoft's .NET Framework gives developers many conveniences. As XML has grown in importance, developers have been hoping for a complete set of powerful XML tools. The .NET Framework lives up to those expectations and provides the following classes for XML in the System.Xml namespace:
XmlTextReader------provides fast, forward-only, non-cached access to XML data. (Forward-only means the XML file can only be read from front to back, not in reverse.)
XmlValidatingReader------used together with the XmlTextReader class, it provides validation against DTD, XDR, and XSD schemas.
XmlDocument------implements Level 1 and Level 2 of the W3C Document Object Model specification, giving random, cached access to XML data. Level 1 contains the most basic parts of the DOM, while Level 2 adds a number of improvements, including support for namespaces and cascading style sheets (CSS).
XmlTextWriter------generates XML files that conform to the W3C XML 1.0 specification.
This article mainly discusses the first class, XmlTextReader. Its purpose is to read data from XML files quickly without placing heavy demands on system resources (chiefly memory and processor time). Under the control of the host program, it works through the XML file step by step, processing only one node at a time. At each node, the host program can determine the node's type, its attributes and data (if any), and other information about it. Based on this information, the host program can choose to process the node or ignore it, according to the needs of the application. This is called a pull processing model, because the host program makes the request, pulls the individual nodes out of the XML file, and then processes them or not as needed.
We can compare the XmlTextReader class with the Simple API for XML (SAX), another technique for reading XML data that is very popular among programmers. XmlTextReader and SAX are similar in that both can read data from XML files quickly without consuming many system resources. However, unlike the pull model of XmlTextReader, SAX uses a push model: the XML processor uses "events" to notify the host application that node data is available, and the host program responds or ignores it as needed. In other words, the data is pushed from the SAX processor to the host. Programmers will no doubt debate whether the pull or the push model has more advantages, but there is no denying that both work well. The .NET Framework does not support SAX, but you can use existing SAX tools, such as the MSXML parser, with your .NET applications.
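To make the difference between the two models concrete, here is a minimal sketch. The Push method is a hypothetical callback-style wrapper written only for illustration; it is not a real SAX API. The class name, the method names, and the sample element names are all assumptions. In Push, the parsing loop owns the control flow and calls back into the host for every node. In PullFirstText, the host drives the reader and can stop as soon as it has what it needs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class PushPullDemo
{
    // Push style (hypothetical wrapper, not SAX): the loop lives inside the
    // parser-side code and notifies the host through callbacks.
    public static void Push(string xml, Action<string> onElement, Action<string> onText)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    onElement(reader.Name);
                else if (reader.NodeType == XmlNodeType.Text)
                    onText(reader.Value);
            }
        }
        finally
        {
            reader.Close();
        }
    }

    // Pull style: the host asks for the next node and returns early
    // once the element it wants has been found.
    public static string PullFirstText(string xml, string elementName)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                    return reader.ReadString(); // text content of this element
            }
            return null; // element not present
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        string xml = "<people><person><name>Ann</name></person></people>";

        List<string> seen = new List<string>();
        Push(xml, seen.Add, delegate(string text) { seen.Add("text:" + text); });
        Console.WriteLine(string.Join(", ", seen.ToArray()));

        Console.WriteLine(PullFirstText(xml, "name"));
    }
}
```

With the pull version, the rest of the document is never parsed once the answer is found, which is one practical reason the pull model can be convenient for lookups.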
The XmlTextReader class has constructors for a variety of situations, such as reading data from an existing stream or from a Uniform Resource Locator. Most commonly, you will want to read XML data from a file, and there is a constructor for that too. Here is an example (all my code examples are in C#; they are easy to convert if you prefer Visual Basic).
XmlTextReader myReader;
myReader = new XmlTextReader(@"c:\data\sales.XML");
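As a rough illustration of the other constructors mentioned above, the following sketch opens the same XML content three ways: by file path, from a Stream, and from a TextReader. The temporary file, the sample XML, and the helper name CountElements are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReaderSources
{
    // Reads to the end of the input and counts the element start tags.
    public static int CountElements(XmlTextReader reader)
    {
        int count = 0;
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    count++;
            }
        }
        finally
        {
            reader.Close();
        }
        return count;
    }

    public static void Main()
    {
        string xml = "<sales><item>1</item><item>2</item></sales>";
        string path = Path.GetTempFileName(); // scratch file for the demo
        File.WriteAllText(path, xml);
        try
        {
            // 1. From a file name (or URL).
            int fromFile = CountElements(new XmlTextReader(path));

            // 2. From an existing stream.
            int fromStream;
            using (FileStream fs = File.OpenRead(path))
            {
                fromStream = CountElements(new XmlTextReader(fs));
            }

            // 3. From a TextReader, here an in-memory string.
            int fromText = CountElements(new XmlTextReader(new StringReader(xml)));

            Console.WriteLine(fromFile + " " + fromStream + " " + fromText);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The StringReader form is handy for small tests because it needs no file on disk.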
Next, create a loop that calls the Read() method. This method returns true until the end of the file is reached, at which point it returns false. In other words, the loop starts at the beginning of the file and reads in all the nodes, one at a time, until it reaches the end of the file:
while (myReader.Read())
{
    ...
    // Process each node here.
    ...
}
After each successful call to Read(), the XmlTextReader instance contains information about the current node (that is, the node just read from the file). We can obtain this information from the members of XmlTextReader, as described in Table 1, and determine the type of the current node through the NodeType property. Based on the node type, the program can read the node's data, check whether it has attributes, and either ignore it or process it as the program requires.
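As one possible illustration of checking a node for attributes, the sketch below walks a document and records every attribute it finds on element nodes. The class name, the method name ListAttributes, and the sample XML (including the id and role attributes) are assumptions for the example, not taken from the article's data file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class AttributeDemo
{
    // Returns one "element@attribute=value" entry per attribute in the document.
    public static List<string> ListAttributes(string xml)
    {
        List<string> result = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.HasAttributes)
                {
                    string element = reader.Name;
                    // Step through the attributes of the current element.
                    while (reader.MoveToNextAttribute())
                    {
                        result.Add(element + "@" + reader.Name + "=" + reader.Value);
                    }
                    // Return to the element node before reading on.
                    reader.MoveToElement();
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return result;
    }

    public static void Main()
    {
        string xml = "<person id=\"7\" role=\"admin\"><name>Ann</name></person>";
        foreach (string entry in ListAttributes(xml))
        {
            Console.WriteLine(entry);
        }
        // Prints:
        // person@id=7
        // person@role=admin
    }
}
```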
When using the NodeType property, it is important to understand how nodes relate to XML markup. For example, look at the following XML element:
<city>Chongqing</city>
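XmlTextReader treats this element as 3 nodes, in the following order:
1. The <city> tag is read as a node of type XmlNodeType.Element. The element name "city" is available from the Name property of XmlTextReader.
2. The text data "Chongqing" is read as a node of type XmlNodeType.Text. The data "Chongqing" is available from the Value property of XmlTextReader.
3. The </city> tag is read as a node of type XmlNodeType.EndElement. Again, the element name "city" is available from the Name property of XmlTextReader.
These are the 3 most important node types. The other types are described in detail in the .NET documentation.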
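The three-node breakdown above can be checked with a short sketch like the following. The class and method names are assumptions; the XML is the <city> element from the text.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeDump
{
    // Returns one line per node, showing its type, Name and Value.
    public static List<string> Describe(string xml)
    {
        List<string> lines = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                lines.Add(reader.NodeType + ": Name=\"" + reader.Name
                          + "\", Value=\"" + reader.Value + "\"");
            }
        }
        finally
        {
            reader.Close();
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe("<city>Chongqing</city>"))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // Element: Name="city", Value=""
        // Text: Name="", Value="Chongqing"
        // EndElement: Name="city", Value=""
    }
}
```

Note that a document with line breaks or indentation between tags would also produce whitespace nodes, since XmlTextReader reports whitespace by default.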
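If XmlTextReader encounters an error, such as a violation of XML syntax, it throws an exception of type System.Xml.XmlException. Code that uses this class should always be protected (in a try...catch block), as you will see later in the demonstration program.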
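A minimal sketch of that protection, assuming a helper named FindSyntaxError (a hypothetical name) that returns a description of the first syntax error or null if the input parses cleanly:

```csharp
using System;
using System.IO;
using System.Xml;

public static class SafeRead
{
    // Reads the whole input. Returns null if it is well-formed,
    // otherwise the position and message of the XmlException.
    public static string FindSyntaxError(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read())
            {
                // Nothing to do: we only want to know whether parsing succeeds.
            }
            return null;
        }
        catch (XmlException ex)
        {
            return "line " + ex.LineNumber + ", position " + ex.LinePosition
                   + ": " + ex.Message;
        }
        finally
        {
            reader.Close();
        }
    }

    public static void Main()
    {
        // Well-formed input: no error is reported.
        Console.WriteLine(FindSyntaxError("<city>Chongqing</city>") == null);

        // The end tag does not match the start tag, so Read() throws.
        Console.WriteLine(FindSyntaxError("<city>Chongqing</town>"));
    }
}
```

Because the reader parses lazily, the exception is raised from Read() when the bad markup is reached, not from the constructor, so the try block needs to cover the whole read loop.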
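This article is only a fairly simple introduction to the XmlTextReader class. The class has a great many members, and it is not possible to cover them all here. XmlTextReader offers considerable flexibility when reading XML data. Even so, I have covered enough ground that readers can write programs for a task that often comes up in the real world: reading data from an XML file and outputting it as HTML for display in a browser.
This ASP.NET program (script) runs on the server and produces an HTML page that is returned to the browser. The script is given in Listing 1, and the XML data file it works with is given in Listing 2. You can see that the XML file contains a list of contacts; the goal of the program is to display this list, formatted so that it is easier to read.
To run the program:
1. Save Listing 1 as XmlTextReader.aspx and Listing 2 as XMLData.xml.
2. Place both files in a virtual folder on a web server that has the .NET Framework installed.
3. Open Internet Explorer and browse to the .aspx file. For example, on a local server, the URL would be http://localhost/xmltextreader.ASPx
Most of the work is done by the XmlDisplay class, in particular by its ProcessXml() method. It reads the XML data one node at a time. For elements of interest, the node name followed by a colon and the node data are written to the output together with the appropriate HTML formatting tags. At this stage, the "output" is a StringBuilder object in which the HTML text is stored temporarily.
The ProcessXml() method is called from the LoadDocument() method. LoadDocument() creates an XmlTextReader instance and loads the XML file before calling ProcessXml(). It also handles exceptions, producing an error message that is displayed in the browser. Finally, the method returns a string that contains either the generated HTML content or, if an exception occurred, the error message.
Execution begins with Page_Load(), which runs automatically when a browser requests the page. The code here instantiates the XmlDisplay class and calls its LoadDocument() method. If all goes well, the formatted HTML return value is copied into a <div> tag on the page, and the resulting HTML document is sent back to the browser and displayed.
How do other .NET Framework classes, such as XmlDocument, fare at reading XML data? Unlike XmlTextReader, the XmlDocument class builds a node tree of the entire XML document in memory. This allows random access to the XML data (in contrast to the linear access of XmlTextReader) and gives great flexibility when modifying the data and structure of the XML file. In addition, XmlDocument allows XSLT transformations to be performed. These extra capabilities, however, come at the cost of lower speed and greater use of system resources.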
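The random access and in-place modification described above might look like the following sketch. The sample XML is invented for illustration: the element names people, person, category, and email appear in the article's code, while name, the data values, and the helper names are assumptions.

```csharp
using System;
using System.Xml;

public static class DocumentDemo
{
    public const string Sample =
        "<people>"
        + "<person><category>first</category><name>Ann</name><email>ann@example.com</email></person>"
        + "<person><category>last</category><name>Bob</name><email>bob@example.com</email></person>"
        + "</people>";

    // Random access: jump straight to a person by category using XPath.
    public static string FindName(XmlDocument doc, string category)
    {
        XmlNode name = doc.SelectSingleNode(
            "/people/person[category='" + category + "']/name");
        return name == null ? null : name.InnerText;
    }

    // Modification: change a person's email in the in-memory tree.
    public static bool SetEmail(XmlDocument doc, string category, string email)
    {
        XmlNode node = doc.SelectSingleNode(
            "/people/person[category='" + category + "']/email");
        if (node == null)
            return false;
        node.InnerText = email;
        return true;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Sample); // the whole tree is now in memory

        Console.WriteLine(FindName(doc, "last"));   // Bob
        Console.WriteLine(FindName(doc, "first"));  // Ann (no need to read in order)

        SetEmail(doc, "last", "robert@example.com");
        Console.WriteLine(doc.OuterXml);
    }
}
```

Neither operation is possible with XmlTextReader alone, which can only move forward and cannot change the document. The price is that LoadXml (or Load) parses and stores every node before the first query runs.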
Listing 1: XmlTextReader.aspx
<%@ Import Namespace="System.Xml" %>
<script language="C#" runat=server>
public class XmlDisplay
// This class reads and processes the XML file.
{
    public string LoadDocument(String XmlFileName)
    {
        XmlTextReader xmlReader = null;
        StringBuilder html = new StringBuilder();
        try
        {
            // Create an instance of XmlTextReader.
            xmlReader = new XmlTextReader(XmlFileName);
            // Process the XML file.
            html.Append(ProcessXml(xmlReader));
        }
        catch (XmlException ex)
        {
            html.Append("An XML exception occurred: " + ex.ToString());
        }
        catch (Exception ex)
        {
            html.Append("A general exception occurred: " + ex.ToString());
        }
        finally
        {
            if (xmlReader != null)
                xmlReader.Close();
        }
        return html.ToString();
    }

    // This method reads the XML file and generates the output HTML.
    private string ProcessXml(XmlTextReader xmlReader)
    {
        StringBuilder temp = new StringBuilder();
        while (xmlReader.Read())
        {
            // Handle the start of an element node.
            if (xmlReader.NodeType == XmlNodeType.Element)
            {
                // Ignore the <people> and <person> elements.
                if ((xmlReader.Name != "person") && (xmlReader.Name != "people"))
                {
                    // If it is a <category> element, start a new paragraph.
                    if (xmlReader.Name == "category")
                        temp.Append("<p>");
                    // Add the element name to the output.
                    temp.Append(xmlReader.Name + ": ");
                }
            }
            // Handle text nodes.
            else if (xmlReader.NodeType == XmlNodeType.Text)
                temp.Append(xmlReader.Value + "<br>");
            // Handle the end of an element node.
            else if (xmlReader.NodeType == XmlNodeType.EndElement)
            {
                // If it is an <email> node, add the tag that ends the paragraph.
                if (xmlReader.Name == "email")
                    temp.Append("</p>");
            }
        } // end of while loop
        return temp.ToString();
    } // end of ProcessXml method
} // end of XmlDisplay class

private void Page_Load(Object sender, EventArgs e)
{
    // Create an instance of the XmlDisplay class.
    XmlDisplay XmlDisplayDemo = new XmlDisplay();
    output.InnerHtml = XmlDisplayDemo.LoadDocument(Server.MapPath("XMLData.xml"));
}
</script>
<html><head></head><body><h2>Demonstrating the XmlTextReader class</h2>
<div id="output" runat="server"/></body></html>
using System;
using System.Xml;

class Program
{
    public static string XmlFileName = "../../XML/1.xml";

    // Times one of the two approaches; swap the call below to compare them.
    static void Main(string[] args)
    {
        DateTime d1 = DateTime.Now;
        XmlDocumentTest(); // or XmlTextReaderTest()
        DateTime d2 = DateTime.Now;
        TimeSpan ts = d2 - d1;

        Console.WriteLine(ts.TotalMilliseconds);
        Console.Read();
    }

    // Streams through the file and stops at the first text node equal to "last".
    private static void XmlTextReaderTest()
    {
        XmlTextReader reader = new XmlTextReader(XmlFileName);
        try
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text && reader.Value == "last")
                {
                    return;
                }
            }
        }
        finally
        {
            reader.Close();
        }
    }

    // Loads the whole document into memory, then finds the matching <person> with XPath.
    private static void XmlDocumentTest()
    {
        XmlDocument xd = new XmlDocument();
        xd.Load(XmlFileName);
        XmlNode node = xd.SelectSingleNode("/people/person[category='last']");
        if (node != null)
        {
            Console.Write(node.Name);
        }
    }
}