Parsing XML files in Python, with examples

PHPz
2017-04-23 16:44:51

The xml.etree.ElementTree module can be used to parse XML files, as shown in the steps that follow. It provides two classes for this purpose:

  • ElementTree represents the entire XML file (a tree structure)

  • Element represents an element (node) in the tree

The examples in this article work on the following XML file, migapp.xml:

[Image: contents of migapp.xml]
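
Since the image of migapp.xml is not reproduced here, a small stand-in document can help when trying the examples. The sketch below is only an assumption about the general shape of such a file: the element and attribute names (migration, component, environment, variable, text) are hypothetical, chosen so that the calls used later in the article have something to match.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for migapp.xml; the real file's contents are not known here.
SAMPLE_XML = """\
<migration version="1.0">
  <component name="app">
    <text>hello</text>
    <displayName>Sample application</displayName>
  </component>
  <environment context="System">
    <variable name="HklmWowSoftware"/>
    <text>world</text>
  </environment>
  <environment context="User">
    <variable name="UserProfile"/>
    <variable name="AppData"/>
  </environment>
</migration>
"""

# fromstring() parses a string and returns the root Element directly.
root = ET.fromstring(SAMPLE_XML)
child_tags = [child.tag for child in root]

print(root.tag, root.attrib)
print(child_tags)
```

Parsing from a string with ET.fromstring is convenient for experiments because no file on disk is needed.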

We can import the ElementTree module as follows: import xml.etree.ElementTree as ET

Alternatively, we can import only the parse function: from xml.etree.ElementTree import parse

First, open the XML file. For a local file, use the open function; for a file on the Internet, use urlopen:

f = open('migapp.xml', 'rt', encoding='utf-8')
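
As a minimal sketch of the opening step, ET.parse accepts either an open file object or a path. The file name sample.xml and its contents below are hypothetical, created in a temporary directory only so that the example runs on its own. The object returned by urllib.request.urlopen can, as far as we know, be passed to ET.parse in the same way; that is not shown here because it needs network access.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Create a small throwaway file so the sketch is runnable (hypothetical content).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'sample.xml')
with open(path, 'w', encoding='utf-8') as out:
    out.write('<root lang="en"><item>one</item></root>')

# Option 1: pass an open file object; the with-block closes it afterwards.
# Binary mode lets the parser handle the document's encoding itself.
with open(path, 'rb') as f:
    tree_from_file = ET.parse(f)

# Option 2: pass the path and let parse() open and close the file.
tree_from_path = ET.parse(path)

print(tree_from_file.getroot().tag)
print(tree_from_path.getroot().attrib)
```

Using a with-block (or passing the path) avoids leaving the file handle open after parsing.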

Then parse the XML.

1 Parse XML file

1.1 Parse the root element

tree = ET.parse(f)
root = tree.getroot()
print('root.tag =', root.tag)
print('root.attrib =', root.attrib)

1.2 Parse the children of the root

for child in root:      # yields only root's direct children, not deeper descendants
    print(child.tag)
    print(child.attrib) # attrib is a dict
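
To make the difference between children and descendants concrete, here is a small sketch on a hypothetical four-element document. Iterating an element directly visits only its children, while iter() walks the whole subtree, starting with the element itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: a has children b and d; c is a grandchild.
root = ET.fromstring('<a><b><c/></b><d/></a>')

children = [child.tag for child in root]       # direct children only
descendants = [el.tag for el in root.iter()]   # root itself plus all descendants, in document order

print(children)     # ['b', 'd']
print(descendants)  # ['a', 'b', 'c', 'd']
```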

1.3 Access descendants of the root by index

print(root[1][1].tag)
print(root[1][1].text)
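
Index access behaves much like indexing nested lists. The following sketch uses a hypothetical document to show what root[1][1] refers to and what happens when the index does not exist.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two children; the second has two children of its own.
root = ET.fromstring('<root><first/><second><x>0</x><y>42</y></second></root>')

child_count = len(root)   # number of direct children
node = root[1][1]         # second child of the second child
print(child_count)        # 2
print(node.tag, node.text)  # y 42

# An index past the end raises IndexError, just as it does for a list.
try:
    root[5]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)       # True
```

Because indexes depend on the exact order of elements, this style is fragile if the file layout changes; looking elements up by tag is usually more robust.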

1.4 Iterate over all elements with a given tag

for element in root.iter('environment'):
    print(element.attrib)
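
A short sketch of what iter(tag) matches, using a hypothetical document in which one environment element is nested inside a group element: iter() finds matching elements at any depth, whereas a plain loop over root would see only the top-level one.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one environment at the top level, one nested in <group>.
root = ET.fromstring(
    '<root>'
    '<environment context="System"/>'
    '<group><environment context="User"/></group>'
    '</root>'
)

# iter('environment') walks the whole subtree and yields every match.
any_depth = [env.get('context') for env in root.iter('environment')]

# A plain loop over root only sees direct children.
top_level = [child.get('context') for child in root if child.tag == 'environment']

print(any_depth)  # ['System', 'User']
print(top_level)  # ['System']
```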

1.5 Several useful methods

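# element.findall(tag) returns all direct children of element with the given tag
# element.find(tag) returns the first such child, or None if there is none
# element.get(name) returns the value of the attribute called name
for environment in root.findall('environment'):
    first_variable = environment.find('variable')
    if first_variable is not None:
        print(first_variable.get('name'))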
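
The three methods can be tried on a hypothetical document in which the second environment element has no variable child. The sketch also shows two details worth knowing: get() accepts a default value, and a path beginning with './/' makes findall() search all descendants instead of only direct children.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the second <environment> is deliberately empty.
root = ET.fromstring(
    '<root>'
    '<environment><variable name="A"/><variable name="B"/></environment>'
    '<environment/>'
    '</root>'
)

names = []
for environment in root.findall('environment'):
    first_variable = environment.find('variable')
    if first_variable is None:          # find() returns None when nothing matches
        names.append(None)
    else:
        names.append(first_variable.get('name'))
print(names)  # ['A', None]

# get() takes an optional default for attributes that are absent.
missing = root.get('missing', 'n/a')
print(missing)  # n/a

# './/variable' matches <variable> elements at any depth below root.
all_names = [v.get('name') for v in root.findall('.//variable')]
print(all_names)  # ['A', 'B']
```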

2 Modify the XML file

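Suppose we need to add an attribute size="50" to each text element, change its text to "Benxin Tuzi", and add a child element date containing the text "2016/01/16":

# list() takes a snapshot first, because the loop adds elements to the tree
for text in list(root.iter('text')):
    text.set('size', '50')
    text.text = 'Benxin Tuzi'
    # Extra keyword arguments to ET.Element() become attributes, so set .text separately
    date = ET.SubElement(text, 'date')
    date.text = '2016/01/16'
tree.write('output.xml')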
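
To check the result of such a modification without opening the output file, the tree can be serialized to a string. This is a sketch on a hypothetical one-element document, not on migapp.xml itself; it sets the text of the new date element through its .text attribute.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<doc><text>old</text></doc>')  # hypothetical document

# Take a snapshot of the matches first, since the loop adds elements to the tree.
for text in list(root.iter('text')):
    text.set('size', '50')              # add or overwrite an attribute
    text.text = 'Benxin Tuzi'           # replace the element's text
    date = ET.SubElement(text, 'date')  # create <date> as a child of <text>
    date.text = '2016/01/16'

# encoding='unicode' makes tostring() return str instead of bytes.
result = ET.tostring(root, encoding='unicode')
print(result)
# <doc><text size="50">Benxin Tuzi<date>2016/01/16</date></text></doc>
```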

The relevant part of migapp.xml before the change:

[Image: excerpt of migapp.xml]

The corresponding part of output.xml:

[Image: excerpt of output.xml]

3 Notes

    Do not name your own script xml.py; otherwise the following error occurs:

    ImportError: No module named 'xml.etree'; 'xml' is not a package

Analysis:

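When resolving an import, Python searches the script's own directory (the current path) first. It finds the xml.py module there and uses it instead of the standard library's xml package. The xml.py we wrote ourselves is a plain module, not a package, so xml.etree cannot be found inside it.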

Note:

If the import still fails after xml.py has been deleted, a compiled xml.pyc may have been left behind in the same directory. The interpreter can still load the module from that file, so it must be deleted as well. After that, the problem is resolved.

Conclusion:

Avoid giving your own files the same name as a package or module, even if the script does not use that package or module itself; otherwise strange errors may occur.
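
One quick way to check which xml was actually imported is to inspect the module object. This is a minimal diagnostic sketch; it relies only on the fact that packages carry a __path__ attribute while plain modules do not.

```python
import xml

# Shows where the imported xml came from; a path inside your project
# directory would suggest that a local file is shadowing the standard library.
print(xml.__file__)

# The standard-library xml is a package, so it has __path__;
# a stray xml.py would be a plain module without it.
is_package = hasattr(xml, '__path__')
print(is_package)
```

If is_package is False, a local xml.py (or a leftover compiled file) is most likely being picked up instead of the standard library.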

    Many of the parsing functions in the ElementTree module need to read the entire XML document into memory first, which is a problem for large documents. It matters even more when the XML comes from a network or a pipe, where non-blocking parsing is important. In that case we can use the XMLPullParser class in the ElementTree module. Alternatively, we can use the module's iterparse() function, which processes the document incrementally and does not need the whole of a large XML file in memory before we can start working with it.
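
Both approaches can be sketched on a small hypothetical log document. The chunk boundaries and the helper name drain below are assumptions made for illustration. With XMLPullParser we feed data as it arrives and collect whatever events are ready; calling close() at the end and draining once more is a conservative step so that no buffered events are missed. With iterparse() we clear each element after use, which is what keeps memory usage low.

```python
import io
import xml.etree.ElementTree as ET

# Chunks as they might arrive from a socket or pipe (hypothetical data).
chunks = ['<log><entry id="1">a</en', 'try><entry id="2">b</entry>', '</log>']

def drain(parser, out):
    """Collect (id, text) for every <entry> whose end tag has been parsed so far."""
    for event, elem in parser.read_events():
        if elem.tag == 'entry':
            out.append((elem.get('id'), elem.text))

pulled = []
parser = ET.XMLPullParser(events=('end',))
for chunk in chunks:
    parser.feed(chunk)      # returns immediately; never waits for more input
    drain(parser, pulled)
parser.close()              # signal end of input
drain(parser, pulled)       # pick up any events that were still buffered
print(pulled)               # [('1', 'a'), ('2', 'b')]

# iterparse() reads from a file or file-like object and yields elements as they complete.
source = io.BytesIO(b'<log><entry id="1">a</entry><entry id="2">b</entry></log>')
streamed = []
for event, elem in ET.iterparse(source, events=('end',)):
    if elem.tag == 'entry':
        streamed.append(elem.get('id'))
        elem.clear()        # drop the element's contents once processed
print(streamed)             # ['1', '2']
```

Note that iterparse() still blocks while reading from its source, so XMLPullParser is the better fit when the data arrives in pieces and the program must not wait.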
