Java&Xml Tutorial (5) Using SAX to parse XML files

Author: 黄舟 (original)
2017-02-22 14:42:06

SAX parsing in Java gives us an API for processing XML files. Unlike DOM parsing, SAX does not load the entire content of the XML file into memory at once. It reads the document sequentially and reports what it finds as a stream of events.

The javax.xml.parsers.SAXParser class parses XML documents in an event-driven way. It wraps an org.xml.sax.XMLReader implementation and provides overloaded parse() methods that read XML from a File, an InputStream, a SAX InputSource, or a URI string.
The actual processing is done by a handler class that we write ourselves. The callbacks it can receive are defined by the org.xml.sax.ContentHandler interface. The parser invokes them as events occur, for example startDocument(), endDocument(), startElement(), endElement(), and characters().
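To make the event model concrete, here is a minimal sketch that records the callbacks in the order the parser fires them. The class name SaxEventTrace and its trace() helper are hypothetical names introduced for illustration, and the XML is passed as an in-memory string through the InputSource overload of parse().

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the order in which SAX callbacks fire.
class SaxEventTrace {

    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("startElement " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("endElement " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text so the trace stays readable
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("characters " + text);
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped in an unchecked exception only to keep the demo short
            throw new IllegalStateException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        trace("<Employee id=\"1\"><name>Pankaj</name></Employee>").forEach(System.out::println);
    }
}
```

The trace always begins with startDocument and ends with endDocument, and every startElement is matched by an endElement in properly nested order. Nothing is kept in memory beyond what the handler itself chooses to store.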

org.xml.sax.helpers.DefaultHandler provides a default, do-nothing implementation of the ContentHandler interface, so we can extend it to write our own handler. Extending it is usually the better choice because we only need to override the few methods we care about, which keeps the code short and maintainable.
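As a small illustration of that point, the sketch below overrides a single callback and inherits the empty defaults for everything else. ElementCounter and its count() helper are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: overrides one callback, inherits no-op defaults for the rest.
class ElementCounter extends DefaultHandler {

    // Element name -> number of start tags seen
    private final Map<String, Integer> counts = new TreeMap<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        counts.merge(qName, 1, Integer::sum);
    }

    Map<String, Integer> getCounts() {
        return counts;
    }

    // Convenience helper: parses an in-memory XML string and returns the counts
    static Map<String, Integer> count(String xml) {
        ElementCounter counter = new ElementCounter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), counter);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped in an unchecked exception only to keep the demo short
            throw new IllegalStateException("Could not parse XML", e);
        }
        return counter.getCounts();
    }

    public static void main(String[] args) {
        System.out.println(count("<Employees><Employee/><Employee/></Employees>"));
    }
}
```

Implementing ContentHandler directly would require a body for every method of the interface, even for events the handler ignores.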
The following is the XML document we want to parse:
employees.xml

<?xml version="1.0" encoding="UTF-8"?>
<Employees>
    <Employee id="1">
        <age>29</age>
        <name>Pankaj</name>
        <gender>Male</gender>
        <role>Java Developer</role>
    </Employee>
    <Employee id="2">
        <age>35</age>
        <name>Lisa</name>
        <gender>Female</gender>
        <role>CEO</role>
    </Employee>
    <Employee id="3">
        <age>40</age>
        <name>Tom</name>
        <gender>Male</gender>
        <role>Manager</role>
    </Employee>
    <Employee id="4">
        <age>25</age>
        <name>Meghna</name>
        <gender>Female</gender>
        <role>Manager</role>
    </Employee>
</Employees>

This XML file stores employee information. Each Employee element has an id attribute and age, name, gender, and role child elements.
We will use SAX to process the file and build a list of employee objects.
The Employee class models a single employee: Employee.java

package com.journaldev.xml;

public class Employee {
    private int id;
    private String name;
    private String gender;
    private int age;
    private String role;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getGender() {
        return gender;
    }

    public void setGender(String gender) {
        this.gender = gender;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String getRole() {
        return role;
    }

    public void setRole(String role) {
        this.role = role;
    }

    @Override
    public String toString() {
        return "Employee:: ID=" + this.id + " Name=" + this.name + " Age=" + this.age
                + " Gender=" + this.gender + " Role=" + this.role;
    }
}

Next, we extend DefaultHandler to create our own handler class: MyHandler.java

package com.journaldev.xml.sax;

import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import com.journaldev.xml.Employee;

public class MyHandler extends DefaultHandler {

    // List to hold Employee objects
    private List<Employee> empList = null;
    // Employee currently being built
    private Employee emp = null;

    // Flags recording which child element the parser is currently inside
    boolean bAge = false;
    boolean bName = false;
    boolean bGender = false;
    boolean bRole = false;

    // Getter method for the employee list
    public List<Employee> getEmpList() {
        return empList;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            // Create a new Employee and set its id from the id attribute
            String id = attributes.getValue("id");
            emp = new Employee();
            emp.setId(Integer.parseInt(id));
            // Initialize the list on first use
            if (empList == null) {
                empList = new ArrayList<>();
            }
        } else if (qName.equalsIgnoreCase("name")) {
            // Set the flag for this field; characters() uses it to fill the Employee
            bName = true;
        } else if (qName.equalsIgnoreCase("age")) {
            bAge = true;
        } else if (qName.equalsIgnoreCase("gender")) {
            bGender = true;
        } else if (qName.equalsIgnoreCase("role")) {
            bRole = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            // Add the completed Employee object to the list
            empList.add(emp);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (bAge) {
            // Inside the age element: set the Employee age
            emp.setAge(Integer.parseInt(new String(ch, start, length)));
            bAge = false;
        } else if (bName) {
            emp.setName(new String(ch, start, length));
            bName = false;
        } else if (bRole) {
            emp.setRole(new String(ch, start, length));
            bRole = false;
        } else if (bGender) {
            emp.setGender(new String(ch, start, length));
            bGender = false;
        }
    }
}

The MyHandler class holds a reference to a List of Employee objects and exposes it through a single getter method. It also keeps the Employee object currently being built and one boolean flag per child element (age, name, gender, role). The flags record which element's text the parser is currently inside.
We override three important methods: startElement(), endElement(), and characters().
The startElement() method is called whenever SAXParser encounters the start tag of an element. We override it to create a new Employee object when an Employee tag starts, and to set the matching boolean flag when one of the child elements starts.
The characters() method is called when SAXParser encounters character data inside an element. We use the boolean flags to decide which property of the Employee object receives the text.
The endElement() method is called when SAXParser encounters an end tag. By the time the Employee end tag is reached, all of the employee's fields have been set, so this is where we add the Employee object to the List.
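One caveat about the flag-based approach: the SAX API allows a parser to deliver the text of a single element in several characters() calls, so a handler that takes only the first chunk can lose part of a value. A common alternative, sketched here under that assumption, is to append every chunk to a buffer and store the value in endElement(). BufferingHandler is a hypothetical variant written for this illustration, not part of the original tutorial, and it keeps each employee as a simple map so the example needs no other classes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical variant of MyHandler: buffers text and stores it when the element ends.
class BufferingHandler extends DefaultHandler {

    // Each employee is a field-name -> value map (an assumption made to stay self-contained)
    private final List<Map<String, String>> employees = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> current;

    List<Map<String, String>> getEmployees() {
        return employees;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // discard whitespace collected before this start tag
        if (qName.equals("Employee")) {
            current = new LinkedHashMap<>();
            current.put("id", attributes.getValue("id"));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, so append instead of assigning
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("Employee")) {
            employees.add(current);
            current = null;
        } else if (current != null) {
            // The buffer now holds the complete text of this child element
            current.put(qName, text.toString().trim());
        }
        text.setLength(0);
    }

    // Convenience helper: parses an in-memory XML string
    static List<Map<String, String>> parse(String xml) {
        BufferingHandler handler = new BufferingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped in an unchecked exception only to keep the demo short
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getEmployees();
    }

    public static void main(String[] args) {
        System.out.println(parse(
                "<Employees><Employee id=\"7\"><name>Ann &amp; Bob</name><age>30</age></Employee></Employees>"));
    }
}
```

Text that contains entity references such as &amp;amp; is a typical case where a parser may split the data, which is why the example uses one. Because the value is assembled in endElement(), the result is the same however the parser chooses to chunk the text.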
In the following test program, we use MyHandler to parse the XML document to generate a List of Employee objects.
XMLParserSAX.java

package com.journaldev.xml.sax;

import java.io.File;
import java.io.IOException;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;

import com.journaldev.xml.Employee;

public class XMLParserSAX {

    public static void main(String[] args) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
            MyHandler handler = new MyHandler();
            saxParser.parse(new File("/Users/pankaj/employees.xml"), handler);
            // Get the list of employees
            List<Employee> empList = handler.getEmpList();
            // Print employee information
            for (Employee emp : empList) {
                System.out.println(emp);
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }
}

Running the program produces this output:

Employee:: ID=1 Name=Pankaj Age=29 Gender=Male Role=Java Developer
Employee:: ID=2 Name=Lisa Age=35 Gender=Female Role=CEO
Employee:: ID=3 Name=Tom Age=40 Gender=Male Role=Manager
Employee:: ID=4 Name=Meghna Age=25 Gender=Female Role=Manager

The SAXParserFactory class provides a factory method for obtaining a SAXParser instance. When we call the parse() method of the SAXParser object, we pass in a handler object to receive the callback events. SAX parsing can seem a bit complex at first, but for large XML documents it is more efficient than DOM parsing because the whole document never has to be held in memory.
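The factory is also where parser behavior is configured before a SAXParser is created. The sketch below turns on namespace awareness and the standard secure-processing feature, then parses from an InputStream. FactoryConfigDemo and its helper methods are hypothetical names, and whether these particular settings are appropriate depends on the application.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: configures the factory before creating a parser.
class FactoryConfigDemo {

    static SAXParser newConfiguredParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // With namespace awareness on, localName is populated in the callbacks
        factory.setNamespaceAware(true);
        // Ask the implementation to apply its secure-processing limits
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory.newSAXParser();
    }

    // Counts Employee start tags in the given stream
    static int countEmployees(InputStream in) {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if (localName.equals("Employee")) {
                    count[0]++;
                }
            }
        };
        try {
            newConfiguredParser().parse(in, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped in an unchecked exception only to keep the demo short
            throw new IllegalStateException("Could not parse XML", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee id=\"1\"/><Employee id=\"2\"/></Employees>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(countEmployees(in));
    }
}
```

A factory configured this way can be reused to create further parsers with the same settings.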
Original address: http://www.php.cn/
