Modifying XML using Python programming language means updating or modifying an existing XML file or XML string using Python. Extensible Markup Language (XML) are the most widely used formats for data, because this format is very well supported by modern applications, and is very well suited for further data manipulation and customization. Therefore it is sometimes required to generate XML data using Python or other programming languages. We have seen how to parse or read an existing XML document and build or create a new XML document using Python. Here we will see how to update or modify an XML document using Python.
You may also like to read:
Have Python installed in Windows (or Unix)
Pyhton version and Packages
Here I am using Python 3.6.6 version
Example with Source Code
Preparing your workspace is one of the first things that you can do to make sure that you start off well. The first step is to check your working directory.
When you are working in the Python terminal, you need first navigate to the directory, where your file is located and then start up Python, i.e., you have to make sure that your file is located in the directory where you want to work from.
Let’s move on to the example…
In the below image you see I have opened a cmd prompt and navigated to the directory where I have to create Python script for modifying XML using Python.
Creating Python Script
Now we will create a python script that will read the attached XML file and modify any node value in the XML content and update the same XML file.
XML is an inherently hierarchical data format, and the most natural way to represent it is with a tree. We will be parsing the XML data using xml.etree.ElementTree. ElementTree represents the whole XML document as a tree, and Element represents a single node in this tree. Interactions with the whole document (reading and writing to/from files) are usually done on the ElementTree level. Interactions with a single XML element and its sub-elements are done on the Element level.
Here in the below Python XML builder script, we import the required module. Then we define a method that does the task of pretty printing of the XML structure otherwise all will be written in one line and it would a bit difficult to read the XMl file.
We want to update or modify the below XML document and in this document we have the root element called bookstore. Then we have other sub-elements under the root element called book and so on.
<?xml version='1.0' encoding='utf-8'?> <bookstore specialty="novel"> <book style="autobiography"> <author> <first-name>Joe</first-name> <last-name>Bob</last-name> <award>Trenton Literary Review Honorable Mention</award> </author> <price>12</price> </book> <magazine frequency="monthly" style="glossy"> <price>12</price> <subscription per="year" price="24" /> </magazine> </bookstore>
Finally we write the modified XML document into the file under the current directory where the Python script resides.
So let’s create the below script called xml_modifier.py.
import xml.etree.ElementTree as ET #pretty print method def indent(elem, level=0): i = "\n" + level*" " j = "\n" + (level-1)*" " if len(elem): if not elem.text or not elem.text.strip(): elem.text = i + " " if not elem.tail or not elem.tail.strip(): elem.tail = i for subelem in elem: indent(subelem, level+1) if not elem.tail or not elem.tail.strip(): elem.tail = j else: if level and (not elem.tail or not elem.tail.strip()): elem.tail = j return elem tree = ET.parse('bookstore2.xml') root = tree.getroot() #update all price for price in root.iter('price'): new_price = int(price.text) + 5 price.text = str(new_price) price.set('updated', 'yes') #write to file tree = ET.ElementTree(indent(root)) tree.write('bookstore2.xml', xml_declaration=True, encoding='utf-8')
Testing the Application
Now it’s time to test for the example on modifying XML using Python.
Simply run the above script you should see the modified bookstore2.xml file in the current directory. Here is the below output XML file.
<?xml version='1.0' encoding='utf-8'?> <bookstore specialty="novel"> <book style="autobiography"> <author> <first-name>Joe</first-name> <last-name>Bob</last-name> <award>Trenton Literary Review Honorable Mention</award> </author> <price updated="yes">17</price> </book> <magazine frequency="monthly" style="glossy"> <price updated="yes">17</price> <subscription per="year" price="24" /> </magazine> </bookstore>
That’s all. Hope, you got idea on modifying XML using Python.
Thanks for reading.