Java & XML Tutorial (5): Using SAX to Parse XML Files

The Java SAX parser provides a set of APIs for processing XML files. SAX parsing differs from DOM parsing: it does not load the entire XML file into memory at once, but reads the document sequentially and reports each part to the application as it is encountered.

The javax.xml.parsers.SAXParser class parses XML documents using an event-driven model. It wraps an implementation of the org.xml.sax.XMLReader interface and provides overloaded parse() methods that read XML documents from a File, an InputStream, a SAX InputSource, or a URI string.
The actual parsing work is done by a handler class. We need to create our own handler, which means implementing the org.xml.sax.ContentHandler interface. This interface declares the callback methods that receive notifications as events occur, such as startDocument(), endDocument(), startElement(), endElement(), and characters().
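
As a minimal sketch of the parse() overloads, the following example reads the same in-memory document once through an InputStream and once through an InputSource. The class name ParseOverloadsDemo, the nested CountingHandler, and the sample XML are assumptions made for illustration and are not part of the tutorial code. The handler extends DefaultHandler, a helper class from org.xml.sax.helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: counts start tags using two parse() overloads.
class ParseOverloadsDemo {

    // Minimal handler that only counts start-tag events.
    static class CountingHandler extends DefaultHandler {
        int count = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            count++;
        }
    }

    // Uses the parse(InputStream, DefaultHandler) overload.
    static int countFromStream(String xml) {
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            CountingHandler handler = new CountingHandler();
            newParser().parse(in, handler);
            return handler.count;
        } catch (SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Uses the parse(InputSource, DefaultHandler) overload.
    static int countFromSource(String xml) {
        try {
            CountingHandler handler = new CountingHandler();
            newParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.count;
        } catch (SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    private static SAXParser newParser() {
        try {
            return SAXParserFactory.newInstance().newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee id=\"1\"/><Employee id=\"2\"/></Employees>";
        System.out.println(countFromStream(xml));
        System.out.println(countFromSource(xml));
    }
}
```

Both calls report the same number of start tags, since only the input source differs and the handler is the same.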

org.xml.sax.helpers.DefaultHandler provides a default, do-nothing implementation of the ContentHandler interface, so we can extend this class to write our own handler. Extending it is a sensible choice because we usually need to override only a few methods, which keeps the code short and maintainable.
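
To make the callback order concrete, here is a small sketch that overrides only four DefaultHandler methods and records each event as it arrives. The class name EventTraceDemo, the trace() helper, and the sample document are hypothetical and serve only as illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the order in which SAX callbacks fire.
class EventTraceDemo {

    // Parses the given XML text and returns the events, separated by spaces.
    static String trace(String xml) {
        List<String> events = new ArrayList<>();
        // Only the callbacks we care about are overridden;
        // DefaultHandler supplies empty implementations for the rest.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return String.join(" ", events);
    }

    public static void main(String[] args) {
        System.out.println(trace("<Employees><Employee id=\"1\"><name>Pankaj</name></Employee></Employees>"));
    }
}
```

The trace shows that start and end events nest exactly like the tags in the document, which is what lets a handler keep track of where it is.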
The following is the XML document we want to parse:

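<?xml version="1.0" encoding="UTF-8"?>
<Employees>
    <Employee id="1">
        <age>29</age> <name>Pankaj</name> <gender>Male</gender> <role>Java Developer</role>
    </Employee>
    <Employee id="2">
        <age>35</age> <name>Lisa</name> <gender>Female</gender> <role>CEO</role>
    </Employee>
    <Employee id="3">
        <age>40</age> <name>Tom</name> <gender>Male</gender> <role>Manager</role>
    </Employee>
    <Employee id="4">
        <age>25</age> <name>Meghna</name> <gender>Female</gender> <role>Manager</role>
    </Employee>
</Employees>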

This XML file stores employee information. Each employee has an id attribute and age, name, gender, and role fields.
We will use the SAX parser to process the XML file and build a list of employee objects.
We use the Employee class to model an employee: Employee.java

package com.journaldev.xml;

public class Employee {
    private int id;
    private String name;
    private String gender;
    private int age;
    private String role;

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getGender() {
        return gender;
    }

    public void setGender(String gender) {
        this.gender = gender;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String getRole() {
        return role;
    }

    public void setRole(String role) {
        this.role = role;
    }

    @Override
    public String toString() {
        return "Employee:: ID=" + this.id + " Name=" + this.name + " Age=" + this.age
                + " Gender=" + this.gender + " Role=" + this.role;
    }
}


Next, we extend the DefaultHandler class to create our own handler class: MyHandler.java

package com.journaldev.xml.sax;

import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import com.journaldev.xml.Employee;

public class MyHandler extends DefaultHandler {

    // List to hold Employee objects
    private List<Employee> empList = null;
    // Employee currently being built
    private Employee emp = null;

    // Getter method for the employee list
    public List<Employee> getEmpList() {
        return empList;
    }

    // Flags recording which field element the parser is currently inside
    boolean bAge = false;
    boolean bName = false;
    boolean bGender = false;
    boolean bRole = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes)
            throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            // Create a new Employee and set its id from the id attribute
            String id = attributes.getValue("id");
            emp = new Employee();
            emp.setId(Integer.parseInt(id));
            // Initialize the list on first use
            if (empList == null) {
                empList = new ArrayList<>();
            }
        } else if (qName.equalsIgnoreCase("name")) {
            // Set the flag for the field; characters() uses it to fill the Employee
            bName = true;
        } else if (qName.equalsIgnoreCase("age")) {
            bAge = true;
        } else if (qName.equalsIgnoreCase("gender")) {
            bGender = true;
        } else if (qName.equalsIgnoreCase("role")) {
            bRole = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (qName.equalsIgnoreCase("Employee")) {
            // Add the completed Employee object to the list
            empList.add(emp);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (bAge) {
            // Age element: set the Employee age
            emp.setAge(Integer.parseInt(new String(ch, start, length)));
            bAge = false;
        } else if (bName) {
            emp.setName(new String(ch, start, length));
            bName = false;
        } else if (bRole) {
            emp.setRole(new String(ch, start, length));
            bRole = false;
        } else if (bGender) {
            emp.setGender(new String(ch, start, length));
            bGender = false;
        }
    }
}

The MyHandler class holds a reference to a List of Employee objects and exposes it through a single getter method. Employee objects are added to the List inside the event-handling methods. MyHandler also defines an Employee field and several boolean flags, one per child element, which are used while building each Employee object. Once all properties of an Employee object have been set, it is added to the list.
We override three important methods: startElement(), endElement(), and characters().
When SAXParser encounters the start tag of an element, it calls startElement(). We override this method and use the boolean flags to record which element we are in. This is also where we create a new Employee object when an Employee tag starts.
The characters() method is called when SAXParser encounters character data inside an element. We use the boolean flags to decide which property of the Employee object to set.
The endElement() method is called when SAXParser encounters an end tag. Here we add the Employee object to the List.
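
One caveat about the boolean-flag approach: the SAX specification allows a parser to deliver the text of a single element in several characters() calls, in which case a handler that clears its flag after the first call would keep only the first chunk. A common alternative, sketched below under that assumption, is to accumulate text in a StringBuilder and read it in endElement(). The NameCollector class and its parseNames() helper are hypothetical and collect only the name field to keep the example short.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <name> element.
class NameCollector extends DefaultHandler {

    private final List<String> names = new ArrayList<>();
    // Buffer for the character data of the element currently being read
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // A new element begins: discard any text collected so far
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, so append rather than assign
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The element is complete, so the buffer now holds its full text
        if (qName.equalsIgnoreCase("name")) {
            names.add(text.toString().trim());
        }
    }

    // Convenience helper: parses XML text and returns the collected names.
    static List<String> parseNames(String xml) {
        NameCollector handler = new NameCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return handler.names;
    }

    public static void main(String[] args) {
        String xml = "<Employees><Employee id=\"1\"><name>Pankaj</name></Employee>"
                + "<Employee id=\"2\"><name>Lisa</name></Employee></Employees>";
        System.out.println(parseNames(xml));
    }
}
```

The same pattern could be applied to the age, gender, and role fields by checking qName in endElement() and calling the matching setter.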
In the following test program, we use MyHandler to parse the XML document to generate a List of Employee objects.

package com.journaldev.xml.sax;

import java.io.File;
import java.io.IOException;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXException;

import com.journaldev.xml.Employee;

public class XMLParserSAX {

    public static void main(String[] args) {
        SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
        try {
            SAXParser saxParser = saxParserFactory.newSAXParser();
            MyHandler handler = new MyHandler();
            saxParser.parse(new File("/Users/pankaj/employees.xml"), handler);
            // Get the Employees list
            List<Employee> empList = handler.getEmpList();
            // Print employee information
            for (Employee emp : empList) {
                System.out.println(emp);
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }
}

Running program output:

Employee:: ID=1 Name=Pankaj Age=29 Gender=Male Role=Java Developer
Employee:: ID=2 Name=Lisa Age=35 Gender=Female Role=CEO
Employee:: ID=3 Name=Tom Age=40 Gender=Male Role=Manager
Employee:: ID=4 Name=Meghna Age=25 Gender=Female Role=Manager

The SAXParserFactory class provides a factory method for obtaining a SAXParser instance. When we call the parse() method of the SAXParser object, we pass in a handler object to receive the callback events. SAX parsing can seem somewhat complex at first, but for large XML documents it is more memory-efficient than DOM parsing, because the document is never held in memory as a whole.
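
The factory can also be configured before the parser is created. The sketch below is an illustration rather than part of the tutorial's program: it enables namespace awareness and the JAXP secure-processing feature, then reports the uri, localName, and qName values passed to startElement() for the root element. The class name FactoryConfigDemo, the describeRoot() helper, and the urn:example namespace are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: configures the factory before creating a parser.
class FactoryConfigDemo {

    // Returns "uri|localName|qName" for the root element of the given XML text.
    static String describeRoot(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // Report namespace URIs and local names to the handler
            factory.setNamespaceAware(true);
            // Ask the implementation to apply its secure-processing limits
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            SAXParser parser = factory.newSAXParser();

            String[] root = new String[1];
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes attributes) {
                    // Remember only the first element seen, which is the root
                    if (root[0] == null) {
                        root[0] = uri + "|" + localName + "|" + qName;
                    }
                }
            });
            return root[0];
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeRoot("<e:Employees xmlns:e=\"urn:example\"/>"));
    }
}
```

With namespace awareness enabled, localName holds the element name without its prefix, while qName still carries the prefixed form used in the document.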
