Sample code for parsing XML with DOM4J

黄舟

Mar 18, 2017 pm 05:00 PM

Foreword: Our company's app has been around for a long time, and its older interfaces return their results as XML. Across the project, everyone handles XML in a different way and there is no unified approach, which makes the code troublesome to work with. So, to save time, I don't study other people's XML parsing code when I develop a feature. Whenever I run into XML, I parse it with DOM4J.

There are many ways to parse XML, such as DOM, SAX, and JDOM. I won't go into their usage and principles here (PS: I don't know them well myself). This article covers the basic operations and usage of DOM4J.

DOM4J Introduction

dom4j is a Java API for reading and writing XML, and is often described as a successor to JDOM. It is known for good performance, a rich feature set, and ease of use. It is commonly said to outperform the DOM implementation that Sun shipped with the JDK, although this article does not measure that. It is open source software and can be found on SourceForge.

dom4j is an easy-to-use, open source library for working with XML, XPath, and XSLT on the Java platform. It uses the Java Collections Framework and fully supports DOM, SAX, and JAXP.

Here is a simple example to introduce the usage of DOM4J.

Note: To parse XML with DOM4J, you need to add the DOM4J jar to the project.

XML file

<Response T='203' T1='6' TaskID='20130800001963' MediaNum='3' Result='1' Desc='查询成功!'>
    <Media Name='IMG_20130425_141838.jpg' Mediasource='1' Type='1' Code='/9j/4AAQSkZJRgABAQA0'>图片1</Media>
    <Media Name='IMG_20130425_141838.jpg' Mediasource='2' Type='1' Code='/9j/4AAQSkZJRgABAQA0'>图片2</Media>
    <Media Name='IMG_20130425_141838.jpg' Mediasource='3' Type='1' Code='/9j/4AAQSkZJRgABAQA0'>图片3</Media>
</Response>

Detailed explanation of DOM4J usage
Step 1: Load the XML file

There are two main ways to load XML:

1. Load it directly from a file path

2. Load it from a string (mainly used for results returned by a server)

1.1. Load directly from a file path

SAXReader reader = new SAXReader();
Document document = null;
try {
    document = reader.read(new File("E://CZBK//day01//caseUp.xml"));
} catch (DocumentException e) {
    e.printStackTrace();
}

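1.2. Load XML from a string

SAXReader reader = new SAXReader();
Document document = null;
try {
    // result is the XML string to parse.
    // The string has to be converted to a stream; the character encoding can be specified.
    // StandardCharsets (java.nio.charset) is used because getBytes("UTF-8") throws the
    // checked UnsupportedEncodingException, which this try block does not catch.
    document = reader.read(new ByteArrayInputStream(result.getBytes(StandardCharsets.UTF_8)));
} catch (DocumentException e) {
    e.printStackTrace();
}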
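As an alternative to wrapping the string in a stream, DOM4J's DocumentHelper.parseText accepts the XML string directly. The following is a minimal sketch, assuming the DOM4J jar is on the classpath. The class name ParseStringDemo, the helper parseRoot, and the shortened sample response are made up for illustration.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

class ParseStringDemo {
    // Parses an XML string and returns its root element.
    // The checked DocumentException is wrapped so callers need no try/catch.
    static Element parseRoot(String xml) {
        try {
            Document document = DocumentHelper.parseText(xml);
            return document.getRootElement();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical server response, shortened from the sample file above.
        String result = "<Response Result='1'><Media Name='a.jpg'>Picture 1</Media></Response>";
        Element root = parseRoot(result);
        System.out.println(root.getName());                 // name of the root element
        System.out.println(root.attributeValue("Result"));  // value of its Result attribute
    }
}
```

Wrapping the exception is only one possible design; in application code you may prefer to let DocumentException propagate to the caller.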

Step 2: Parse the XML

Before parsing, let's go over the names of the parts of an XML document. Being able to answer the following four questions helps a lot when parsing XML:

What is a node? What is an element? What is an attribute? What is a text value?

Node: "Response" and "Media" are called nodes.

Element: Everything from an opening tag to its matching closing tag, including the content in between, is called an element. For example: <Media Name='IMG_20130425_141838.jpg' Mediasource='1' Type='1' Code='/9j/4AAQSkZJRgABAQA0'>图片1</Media>

Attribute: A name/value pair on a node that describes the node's content. For example: T='203' T1='6' TaskID='20130800001963' MediaNum='3' Result='1' Desc='查询成功!'

Text value: "图片1" ("Picture 1") is the text value.

In a project, the work mostly revolves around elements, attributes, and text values, so once you know how to read these three, you know the basics of XML parsing.
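To make the terms concrete, here is a small sketch that prints the element name, attributes, and text value of the first child of the root. Note that it uses the JDK's built-in DOM API (javax.xml.parsers and org.w3c.dom), not DOM4J, so that it runs without any extra jar. The class name XmlTermsDemo and the shortened sample XML are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

class XmlTermsDemo {
    // Describes the first child element of the root: its name, attributes and text.
    static String describeFirstChild(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            // Children are nodes; only some of them are elements (others may be text).
            Element child = null;
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    child = (Element) n;
                    break;
                }
            }
            if (child == null) {
                return "no child element";
            }
            StringBuilder sb = new StringBuilder("element=" + child.getTagName());
            NamedNodeMap attrs = child.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                sb.append(" attribute ").append(a.getNodeName()).append('=').append(a.getNodeValue());
            }
            sb.append(" text=").append(child.getTextContent());
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Response Result='1'><Media Name='a.jpg'>Picture 1</Media></Response>";
        // prints: element=Media attribute Name=a.jpg text=Picture 1
        System.out.println(describeFirstChild(xml));
    }
}
```

The same three pieces map onto the DOM4J calls used in the rest of this article: the element itself, attributeValue for attributes, and getText for the text value.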

2.1. Get the root node

// Get the root element, which contains the whole document
Element rootElement = document.getRootElement();

rootElement contains the content of the entire XML document, that is, everything inside the Response tag.

2.2. Get the attribute value of the Response node

// Get the value of the Result attribute of the Response node
String responseResult = rootElement.attributeValue("Result");

2.3. Get the Media element

// Get the first Media element
Element mediaElement = rootElement.element("Media");
// Get all Media elements
List allMediaElements = rootElement.elements("Media");

2.4. Get the Media attribute value

// Get the Name attribute value of the first Media element
String mediaName = mediaElement.attributeValue("Name");
// Loop over all Media elements and read each Name attribute value
for (int i = 0; i < allMediaElements.size(); i++) {
    Element element = (Element) allMediaElements.get(i);
    String name = element.attributeValue("Name");
}
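The loop above reads each value and discards it. A common next step is to collect the values into a list. The following is a minimal sketch, assuming the DOM4J jar is on the classpath. The class MediaAttributes and its helpers parse and collect are hypothetical names, and the sample XML is shortened.

```java
import java.util.ArrayList;
import java.util.List;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

class MediaAttributes {
    // Parses an XML string and returns the root element (unchecked on failure).
    static Element parse(String xml) {
        try {
            return DocumentHelper.parseText(xml).getRootElement();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }

    // Returns the value of the given attribute for every <Media> child of root.
    static List<String> collect(Element root, String attributeName) {
        List<String> values = new ArrayList<>();
        // Iterating as Object and casting compiles whether elements(...) is
        // declared as a raw List (older DOM4J releases) or as a typed list.
        for (Object child : root.elements("Media")) {
            values.add(((Element) child).attributeValue(attributeName));
        }
        return values;
    }

    public static void main(String[] args) {
        Element root = parse(
                "<Response><Media Mediasource='1'/><Media Mediasource='2'/></Response>");
        System.out.println(collect(root, "Mediasource")); // prints [1, 2]
    }
}
```

attributeValue returns null when an element lacks the attribute, so the resulting list can contain null entries if the input is irregular.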

2.5. Get the text value of the Media tag

// Get the text value of the first Media element
String value = mediaElement.getText();
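Two details are worth handling when reading text values: element("Media") returns null when there is no such child, and getText returns the text exactly as written, including surrounding whitespace. The sketch below, which assumes the DOM4J jar is on the classpath, guards against the first and uses getTextTrim for the second. The class name MediaText and the method firstMediaText are made up for illustration.

```java
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;

class MediaText {
    // Returns the trimmed text of the first <Media> child, or null if there is none.
    static String firstMediaText(String xml) {
        try {
            Element root = DocumentHelper.parseText(xml).getRootElement();
            Element media = root.element("Media");
            if (media == null) {
                return null; // no Media child, so there is no text to read
            }
            // getTextTrim() drops leading and trailing whitespace;
            // getText() would return the text exactly as written.
            return media.getTextTrim();
        } catch (DocumentException e) {
            throw new IllegalArgumentException("Could not parse XML string", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMediaText("<Response><Media>  Picture 1 </Media></Response>"));
        System.out.println(firstMediaText("<Response/>"));
    }
}
```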

Complete code

import java.io.File;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Textxml {
    public void xml() {
        SAXReader reader = new SAXReader();
        Document document;
        try {
            document = reader.read(new File("E://CZBK//day01//caseUp.xml"));
        } catch (DocumentException e) {
            // Without a document there is nothing to parse, so stop here
            // instead of hitting a NullPointerException below.
            e.printStackTrace();
            return;
        }
        // Get the root element, which contains the whole document
        Element rootElement = document.getRootElement();
        System.out.println("Whole document:" + rootElement.asXML());
        // Get the value of the Result attribute of the Response node
        String responseResult = rootElement.attributeValue("Result");
        System.out.println("Result attribute of the Response node:" + responseResult);
        // Get the first Media element
        Element mediaElement = rootElement.element("Media");
        System.out.println("First Media element:" + mediaElement.asXML());
        // Get all Media elements
        List allMediaElements = rootElement.elements("Media");
        // Get the Name attribute value of the first Media element
        String mediaName = mediaElement.attributeValue("Name");
        System.out.println("Name attribute of the first Media element:" + mediaName);
        // Loop over all Media elements and read each Name attribute value
        for (int i = 0; i < allMediaElements.size(); i++) {
            Element element = (Element) allMediaElements.get(i);
            String name = element.attributeValue("Name");
        }
        // Get the text value of the first Media element
        String value = mediaElement.getText();
        System.out.println("Text value of the first Media element:" + value);
    }

    public static void main(String[] args) {
        Textxml textxml = new Textxml();
        textxml.xml();
    }
}

Run results

Whole document:<Response T="203" T1="6" TaskID="20130800001963" MediaNum="3" Result="1" Desc="查询成功!">
    <Media Name="IMG_20130425_141838.jpg" Mediasource="1" Type="1" Code="/9j/4AAQSkZJRgABAQA0">图片1</Media>
    <Media Name="IMG_20130425_141838.jpg" Mediasource="2" Type="1" Code="/9j/4AAQSkZJRgABAQA0">图片2</Media>
    <Media Name="IMG_20130425_141838.jpg" Mediasource="3" Type="1" Code="/9j/4AAQSkZJRgABAQA0">图片3</Media>
</Response>
Result attribute of the Response node:1
First Media element:<Media Name="IMG_20130425_141838.jpg" Mediasource="1" Type="1" Code="/9j/4AAQSkZJRgABAQA0">图片1</Media>
Name attribute of the first Media element:IMG_20130425_141838.jpg
Text value of the first Media element:图片1

Postscript

1. There are many ways to parse XML, and you don't need to master all of them; knowing one well is enough. As for the performance differences between them, I haven't run into any in practice and can't speak to that question.

2. DOM4J has many APIs. This article covers only the most basic and most commonly used ones. If you're interested, you can explore the rest on your own.
