XML/RSS Tutorial

Example code sharing of some techniques for parsing XML and JSON content

黄舟

Mar 17, 2017 pm 05:08 PM

A few tips on parsing XML and JSON content

Overview

In the absence of a unified standard When a system connects to multiple external systems, it often encounters heterogeneous response data from the request interface. It may return XML or
JSON. In addition to the different return types, the content structure is also different. Taking the XML type as an example,
Interface 1 returns content

<root>
    <bizKey>16112638767472747178067</bizKey>
    <returnMsg>OK</returnMsg>
    <returnCode>200</returnCode>
    ...
</root>

Interface 2 returns content

<root>
    <bid>16112638767472747178068</bid>
    <note>成功</note>
    <returnStatus>1</returnStatus>
    ...
</root>

It is obviously unreasonable to process each format of content in our system. In the above content, we only care about three types of information, namely business ID, status value and description information. Can we abstract these three types of information?
After obtaining this information, we can perform business logic processing.

Parsing XML and JSON

According to business abstraction, we need to obtain three types of information from XML or JSON content. We will use XPath and JSONPath to parse here. . For example, to obtain important information about interface 1,
we can set three XPath expressions,

{
    bid: "/root/bizKey",
    code: "/root/returnCode",
    description: "/root/returnMsg"
}

bid, code and descriptionCorresponds to the field name defined by our system.
The same goes for parsing JSON content, except that the JSONPath expression is defined.

Process data content in two steps

Suppose we obtain bid,code and from the original XML and JSON data descriptionInformation,
obtained from interface 1

{
    bid: &#39;16112638767472747178067&#39;,
    code: &#39;200&#39;,
    description: &#39;OK&#39;
}

obtained from interface 2

{
    bid: &#39;16112638767472747178068&#39;,
    code: &#39;1&#39;,
    description: &#39;成功&#39;
}

Assume we get the status value from the interface 1 document 200 indicates that the request was successful , we learned from the interface 2 document that the status value 1 indicates that the request was successful. Although they all indicate that the request was successful, we still cannot
save them intact into our business-related tables (of course these response data It still needs to be saved in another record table, at least to facilitate troubleshooting).
Assume that our business-related tables are designed like this

##Field nameTypeDescriptionbidstringBusiness IDcodeintStatus value, 0=initial, 1=requesting, 2=success, 3=failuredescriptionstringDescription

因此，我们还必须定义规则把接口1返回的状态值200转换为我们系统的2，把接口2返回的状态值1转换为我们系统的2。
总结一下，两步走解析XML和JSON数据内容

根据XPath或者JSONPath表达式解析获得重要信息
根据规则转换状态值

第一步解析数据获得重要信息

以XML为例，

public class XmlParseUtils {
    private DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    private XPathFactory xpathFactory = XPathFactory.newInstance();
    
    /**
     * 
     * @param param    数据内容
     * @param paths 表达式
     * @return
     * @throws Exception
     */
    public Map<String,Object> parse(String param, Map<String,String> paths) throws Exception{
        InputSource inputSource = new InputSource(new StringReader(param));
        Document document = dbFactory.newDocumentBuilder().parse(inputSource);
        Map<String,Object> map = Maps.newHashMap();
        for(String key : paths.keySet()) {
            XPath xpath = xpathFactory.newXPath();
            Node node = (Node) xpath.evaluate(paths.get(key), document, XPathConstants.NODE);
            if(node == null) {
                throw new Exception("node not found, xpath is " + paths.get(key));
            }
            map.put(key, node.getTextContent());
        }
        return map;
    }

}

parse函数的返回类型也可以是Map<string></string>，暂且用Map<string></string>。

第二步根据规则转换状态值

这一步稍稍有点麻烦，不过我们先不考虑代码实现，反正你能想到的可能别人已经帮你实现了。首先我们根据接口文档定义规则，写出规则表达式（或者其他的什么），
又是表达式。假设接口1的返回的状态值比较简单，只有200表示成功，其他情况都是失败，那么我们可以这样定义规则，

code.equals(&quot;200&quot;) ? 2: 3

或者

<#if code == "200">
2
<#else>
3
<#/if>

亦或者

function handle(arg) {
    if(arg == 200) {
        return 2;
    }
    return 3;
}
handle(${code})

以上根据同一份文档定义了三种不同类型的状态值转换规则，肯定需要三种不同的实现。下面一一说明，

三目表达式

code.equals("200") ? 2: 3是一个三目表达式，我们将使用jexl引擎来解析，利用第一步解析数据获得重要信息的结果，我们可以这样做

    public Object evaluateByJexl(String expression, Map<String,Object> context) {
        JexlEngine jexl = new JexlBuilder().create();
        JexlExpression e = jexl.createExpression(expression);
        JexlContext jc = new MapContext(context);
        return e.evaluate(jc);
    }

FreeMarker模板

<#if code == "200">
2
<#else>
3
<#/if>

处理这段模板我们可以这么做

    /**
     * 
     * @param param FreeMarker模板
     * @param context
     * @return
     * @throws Exception
     */
    public String render(String param, Map<String,Object> context) throws Exception {
        Configuration cfg = new Configuration();
        StringTemplateLoader stringLoader = new StringTemplateLoader();
        stringLoader.putTemplate("myTemplate",param);
        cfg.setTemplateLoader(stringLoader);
        Template template = cfg.getTemplate("myTemplate","utf-8");
        StringWriter writer = new StringWriter();
        template.process(context, writer);
        return writer.toString();
    }

如果FreeMarker模板比较复杂，从模板预编译成Template可能会消耗更多的性能，就要考虑把Template缓存起来。

JavaScript代码段

function handle(arg) {
    if(arg == 200) {
        return 2;
    }
    return 3;
}
handle(${code})

这段js代码中存在${code}，首先它需要使用FreeMarker渲染得到真正的handle方法的调用参数，然后

    public Object evaluate(String expression) throws Exception {
        ScriptEngineManager manager = new ScriptEngineManager();
        ScriptEngine engine = manager.getEngineByName("javascript");
        return engine.eval(expression);
    }

ScriptEngineManager的性能估计不太乐观，毕竟是一个语言的引擎。

不同转换规则实现的比较

类型	实现	优点	缺点
三目表达式	Jexl	简单（easy）	简单（simple）
FreeMarker模板	FreeMarker	--	--
JavaScript代码段	FreeMarker + ScriptEngine	直观	过程复杂，性能问题

看起来Freemarker是一个不错的选择。
至此两步走小技巧已经实现了，都是利用了现成的代码实现。

或许我们会这样的挑战，在做状态值转换时需要知道当前系统某个业务状态值的情况，
此时Freemarker表达式可能是这样的，

<# assign lastCode = GetLastCode(code)>
<#if lastCode == "2">
2
<#elseif code == "200">
2
<#else>
3
<#/if>

这里我们可以使用Freemarker的特性，自定义Java函数或工具类，在模板中调用。

The above is the detailed content of Example code sharing of some techniques for parsing XML and JSON content. For more information, please follow other related articles on the PHP Chinese website!

Statement

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

RSS & XML: Understanding the Dynamic Duo of Web ContentApr 19, 2025 am 12:03 AM

RSS and XML are tools for web content management. RSS is used to publish and subscribe to content, and XML is used to store and transfer data. They work with content publishing, subscriptions, and update push. Examples of usage include RSS publishing blog posts and XML storing book information.

RSS Documents: The Foundation of Web SyndicationApr 18, 2025 am 12:04 AM

RSS documents are XML-based structured files used to publish and subscribe to frequently updated content. Its main functions include: 1) automated content updates, 2) content aggregation, and 3) improving browsing efficiency. Through RSSfeed, users can subscribe and get the latest information from different sources in a timely manner.

Decoding RSS: The XML Structure of Content FeedsApr 17, 2025 am 12:09 AM

The XML structure of RSS includes: 1. XML declaration and RSS version, 2. Channel (Channel), 3. Item. These parts form the basis of RSS files, allowing users to obtain and process content information by parsing XML data.

How to Parse and Utilize XML-Based RSS FeedsApr 16, 2025 am 12:05 AM

RSSfeedsuseXMLtosyndicatecontent;parsingtheminvolvesloadingXML,navigatingitsstructure,andextractingdata.Applicationsincludebuildingnewsaggregatorsandtrackingpodcastepisodes.

RSS Documents: How They Deliver Your Favorite ContentApr 15, 2025 am 12:01 AM

RSS documents work by publishing content updates through XML files, and users subscribe and receive notifications through RSS readers. 1. Content publisher creates and updates RSS documents. 2. The RSS reader regularly accesses and parses XML files. 3. Users browse and read updated content. Example of usage: Subscribe to TechCrunch's RSS feed, just copy the link to the RSS reader.

Building Feeds with XML: A Hands-On Guide to RSSApr 14, 2025 am 12:17 AM

The steps to build an RSSfeed using XML are as follows: 1. Create the root element and set the version; 2. Add the channel element and its basic information; 3. Add the entry element, including the title, link and description; 4. Convert the XML structure to a string and output it. With these steps, you can create a valid RSSfeed from scratch and enhance its functionality by adding additional elements such as release date and author information.

Creating RSS Documents: A Step-by-Step TutorialApr 13, 2025 am 12:10 AM

The steps to create an RSS document are as follows: 1. Write in XML format, with the root element, including the elements. 2. Add, etc. elements to describe channel information. 3. Add elements, each representing a content entry, including,,,,,,,,,,,. 4. Optionally add and elements to enrich the content. 5. Ensure the XML format is correct, use online tools to verify, optimize performance and keep content updated.

XML's Role in RSS: The Foundation of Syndicated ContentApr 12, 2025 am 12:17 AM

The core role of XML in RSS is to provide a standardized and flexible data format. 1. The structure and markup language characteristics of XML make it suitable for data exchange and storage. 2. RSS uses XML to create a standardized format to facilitate content sharing. 3. The application of XML in RSS includes elements that define feed content, such as title and release date. 4. Advantages include standardization and scalability, and challenges include document verbose and strict syntax requirements. 5. Best practices include validating XML validity, keeping it simple, using CDATA, and regularly updating.

See all articles

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Assassin's Creed Shadows: Seashell Riddle Solution

3 weeks agoByDDD

What's New in Windows 11 KB5054979 & How to Fix Update Issues

2 weeks agoByDDD

Where to find the Crane Control Keycard in Atomfall

3 weeks agoByDDD

Assassin's Creed Shadows - How To Find The Blacksmith And Unlock Weapon And Armour Customisation

1 months agoByDDD

Roblox: Dead Rails - How To Complete Every Challenge

3 weeks agoByDDD

Hot Tools

MinGW - Minimalist GNU for Windows

This project is in the process of being migrated to osdn.net/projects/mingw, you can continue to follow us there. MinGW: A native Windows port of the GNU Compiler Collection (GCC), freely distributable import libraries and header files for building native Windows applications; includes extensions to the MSVC runtime to support C99 functionality. All MinGW software can run on 64-bit Windows platforms.