

Parsing XML with Python Minidom
source link: https://rowelldionicio.com/parsing-xml-with-python-minidom/

November 29, 2019 By Rowell
A core skill for a DevNet Associate is knowing how to parse XML into a Python structure.
Data can be retrieved from devices in different data formats such as XML, YAML, or JSON.
In this lab, I’ll look at parsing XML into a usable structure within Python.
The first step is to use a Python script to send an HTTP request to our device so we can obtain data, which will be returned in XML format.
Cisco provides a sandbox to test against with this Python script through their Coding 201 Parsing XML lab. I had to modify the script a little so that it ignores certificate verification, and I had to select another sandbox due to authorization issues.
from urllib.request import Request, urlopen
import ssl
req = Request('https://msesandbox.cisco.com/api/contextaware/v1/maps/info/DevNetCampus/DevNetBuilding/DevNetZone')
req.add_header('Authorization', 'Basic bGVhcm5pbmc6bGVhcm5pbmc==')
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
r = urlopen(req, context=ctx)
rString = r.read().decode("utf-8")
print(rString)
r.close()
The following is the output from our UGLY request:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Floor objectVersion="19" name="DevNetZone" isOutdoor="false" floorNumber="1" floorRefId="723413320329068590"><Dimension length="81.9" width="307.0" height="16.5" offsetX="0.0" offsetY="0.0" unit="FEET"/><Image imageName="domain_0_1421088463647.png"/><GPSMarker name="GPS_Marker_17"><GeoCoordinate latitude="36.125859" longitude="-97.066969" unit="DEGREES"/><MapCoordinate x="0.6" y="0.6" unit="FEET"/></GPSMarker><GPSMarker name="GPS_Marker_18"><GeoCoordinate latitude="36.125859" longitude="-97.06595" unit="DEGREES"/><MapCoordinate x="299.77" y="0.6" unit="FEET"/></GPSMarker><GPSMarker name="GPS_Marker_19"><GeoCoordinate latitude="36.125641" longitude="-97.066969" unit="DEGREES"/><MapCoordinate x="0.6" y="80.09" unit="FEET"/></GPSMarker><AccessPoint name="T1-3" radioMacAddress="00:2b:01:00:04:00" ethMacAddress="00:2b:01:00:04:f0" ipAddress="10.10.20.243" numOfSlots="2" apMode="LOCAL"><MapCoordinate x="155.28" y="57.57" unit="FEET"/><ApInterface band="IEEE_802_11_B" slotNumber="0" channelAssignment="1" channelNumber="1" txPowerLevel="1" antennaPattern="Internal-1140-2.4GHz" antennaAngle="1.57"
**TRUNCATED**
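For readers wondering where the Authorization value in the script comes from: HTTP Basic authentication sends username:password encoded as Base64. The sketch below is not part of the lab. It assumes the sandbox credentials are learning / learning, which is what the header in the script decodes to, and the helper name basic_auth_header is hypothetical.

```python
import base64

def basic_auth_header(username, password):
    """Return the value for an HTTP Basic 'Authorization' header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")

# Assumed credentials, recovered by decoding the header used in the script.
header_value = basic_auth_header("learning", "learning")
print(header_value)
```

Standard Base64 ends this value with a single =, while the header in the script carries two. The request returned data as written, so whether the extra character matters likely depends on the server.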
As I’ve described in a previous post, XML is just another data format. It’s commonly used by equipment from other network vendors, such as Juniper and Palo Alto Networks.
In order to parse XML data we will need to import a library. There are two libraries we can use, but in this post we’ll look at Minidom:
Parsing XML with Minidom
Minidom, short for Minimal DOM, is part of Python’s standard library. It’s a minimal implementation of the Document Object Model interface.
To import the library just add the following to the top of the script:
import xml.dom.minidom
Now we modify our script so we can parse the returned XML data.
We’ll add the following lines:
xmlparse = xml.dom.minidom.parseString(rString)
prettyxml = xmlparse.toprettyxml()
print(prettyxml)
The parseString() function in xml.dom.minidom.parseString(rString) parses the XML data into a DOM document, which we assign to xmlparse.
Next, we call the DOM method toprettyxml() on xmlparse to produce a pretty-printed version of the XML output we just saw.
The following output is now more readable. We can see the root node, the nested elements, etc.:
<?xml version="1.0" ?>
<Floor floorNumber="1" floorRefId="723413320329068590" isOutdoor="false" name="DevNetZone" objectVersion="19">
    <Dimension height="16.5" length="81.9" offsetX="0.0" offsetY="0.0" unit="FEET" width="307.0"/>
    <Image imageName="domain_0_1421088463647.png"/>
    <GPSMarker name="GPS_Marker_17">
        <GeoCoordinate latitude="36.125859" longitude="-97.066969" unit="DEGREES"/>
        <MapCoordinate unit="FEET" x="0.6" y="0.6"/>
    </GPSMarker>
    <GPSMarker name="GPS_Marker_18">
        <GeoCoordinate latitude="36.125859" longitude="-97.06595" unit="DEGREES"/>
        <MapCoordinate unit="FEET" x="299.77" y="0.6"/>
    </GPSMarker>
    <GPSMarker name="GPS_Marker_19">
        <GeoCoordinate latitude="36.125641" longitude="-97.066969" unit="DEGREES"/>
        <MapCoordinate unit="FEET" x="0.6" y="80.09"/>
    </GPSMarker>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:04:f0" ipAddress="10.10.20.243" name="T1-3" numOfSlots="2" radioMacAddress="00:2b:01:00:04:00">
        <MapCoordinate unit="FEET" x="155.28" y="57.57"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:05:f0" ipAddress="10.10.20.244" name="T1-4" numOfSlots="2" radioMacAddress="00:2b:01:00:05:00">
        <MapCoordinate unit="FEET" x="213.6" y="12.6"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:06:f0" ipAddress="10.10.20.245" name="T1-5" numOfSlots="2" radioMacAddress="00:2b:01:00:06:00">
        <MapCoordinate unit="FEET" x="253.7" y="58.48"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:03:f0" ipAddress="10.10.20.242" name="T1-2" numOfSlots="2" radioMacAddress="00:2b:01:00:03:00">
        <MapCoordinate unit="FEET" x="98.1" y="11.7"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:02:f0" ipAddress="10.10.20.241" name="T1-1" numOfSlots="2" radioMacAddress="00:2b:01:00:02:00">
        <MapCoordinate unit="FEET" x="43.9" y="57.88"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <LocationFilterRegion regionType="OUTSIDE">
        <MapCoordinate unit="FEET" x="0.0" y="0.0"/>
        <MapCoordinate unit="FEET" x="307.0" y="0.0"/>
        <MapCoordinate unit="FEET" x="307.0" y="81.9"/>
        <MapCoordinate unit="FEET" x="0.0" y="81.9"/>
    </LocationFilterRegion>
</Floor>
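The nesting visible in the pretty-printed output can also be inspected in code. The following is a minimal sketch, not part of the original lab: it parses a small hand-written sample modelled on the response above (the sample_xml string and the child_elements helper are assumptions for illustration) and lists the root node and its direct child elements.

```python
import xml.dom.minidom
from xml.dom import Node

# Hand-written sample modelled on the sandbox response (an assumption,
# so the sketch runs without network access).
sample_xml = (
    '<Floor name="DevNetZone" floorNumber="1">'
    '<Dimension length="81.9" width="307.0" unit="FEET"/>'
    '<AccessPoint name="T1-3" ipAddress="10.10.20.243">'
    '<MapCoordinate x="155.28" y="57.57" unit="FEET"/>'
    '</AccessPoint>'
    '</Floor>'
)

def child_elements(node):
    """Return only the element children of a node, skipping text nodes."""
    return [c for c in node.childNodes if c.nodeType == Node.ELEMENT_NODE]

doc = xml.dom.minidom.parseString(sample_xml)
root = doc.documentElement  # the <Floor> root element
print(root.tagName)
for child in child_elements(root):
    # Show each direct child together with the tags nested inside it.
    grandchildren = [g.tagName for g in child_elements(child)]
    print(child.tagName, grandchildren)
```

Filtering on nodeType matters once the XML contains whitespace between tags, because minidom represents that whitespace as text nodes alongside the elements.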
Now that we have our response, we need to identify what we’re looking for. I want to get information about the access points.
I see a child element, AccessPoint, whose attributes describe an access point:
<AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:03:f0" ipAddress="10.10.20.242" name="T1-2" numOfSlots="2" radioMacAddress="00:2b:01:00:03:00"></AccessPoint>
Let’s grab each access point’s name, ethMacAddress, and ipAddress attributes.
Time to parse through the XML data and get only what we need.
access_points = xmlparse.getElementsByTagName('AccessPoint')
for access_point in access_points:
    ap_name = access_point.getAttribute('name')
    ap_mac = access_point.getAttribute('ethMacAddress')
    ap_ip = access_point.getAttribute('ipAddress')
    print(access_point.tagName + ': ' + ap_name + '\t mac: ' + ap_mac + '\t ip: '+ ap_ip)
What does this do?
Let’s analyze what each line does.
access_points = xmlparse.getElementsByTagName('AccessPoint') – With Minidom it is possible to search the node tree by tag name. That’s what we’re doing here with xmlparse.getElementsByTagName('AccessPoint'). It finds every element named AccessPoint in the document.
Next we’ll get into a for loop to cycle through each of the nodes we found. In this case, the AccessPoint elements.
Within the for loop there are three variables: ap_name, ap_mac, and ap_ip.
We’re going to use a method of Minidom’s Element object, getAttribute, to return the value of the named attribute.
access_point.getAttribute('name') – For each node with a tag of AccessPoint, we want it to return the name of that access point and assign that value to ap_name.
access_point.getAttribute('ethMacAddress') – We’re going to return the MAC address of the access point from the ethMacAddress attribute, if it exists, and assign it to ap_mac.
access_point.getAttribute('ipAddress') – The next attribute I want to collect is the IP address. If returned, it will be assigned to ap_ip.
Next, I want to display that information on the screen. With the print() call, print(access_point.tagName + ': ' + ap_name + '\t mac: ' + ap_mac + '\t ip: '+ ap_ip), we’re going to display the attributes we found.
The output is much more appealing now:
% python parsing_xml_with_python.py
AccessPoint: T1-3 mac: 00:2b:01:00:04:f0 ip: 10.10.20.243
AccessPoint: T1-4 mac: 00:2b:01:00:05:f0 ip: 10.10.20.244
AccessPoint: T1-5 mac: 00:2b:01:00:06:f0 ip: 10.10.20.245
AccessPoint: T1-2 mac: 00:2b:01:00:03:f0 ip: 10.10.20.242
AccessPoint: T1-1 mac: 00:2b:01:00:02:f0 ip: 10.10.20.241
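One detail worth knowing about the "if it exists" cases in the walkthrough: Minidom’s getAttribute() returns an empty string, rather than raising an error, when the attribute is absent. The small sketch below uses a hand-written sample (the XML string and the 'n/a' placeholder are assumptions for illustration) to show that behaviour and how hasAttribute() can tell the two cases apart.

```python
import xml.dom.minidom

# Hand-written sample: the second access point has no ipAddress attribute.
sample_xml = (
    '<Floor>'
    '<AccessPoint name="T1-1" ipAddress="10.10.20.241"/>'
    '<AccessPoint name="T1-2"/>'
    '</Floor>'
)

doc = xml.dom.minidom.parseString(sample_xml)
results = []
for ap in doc.getElementsByTagName('AccessPoint'):
    # getAttribute() gives '' for a missing attribute; hasAttribute() lets
    # us substitute a clearer placeholder instead.
    ip = ap.getAttribute('ipAddress') if ap.hasAttribute('ipAddress') else 'n/a'
    results.append((ap.getAttribute('name'), ip))
    print(ap.getAttribute('name'), ip)
```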
Final Script
from urllib.request import Request, urlopen
import ssl
import xml.dom.minidom
req = Request('https://msesandbox.cisco.com/api/contextaware/v1/maps/info/DevNetCampus/DevNetBuilding/DevNetZone')
req.add_header('Authorization', 'Basic bGVhcm5pbmc6bGVhcm5pbmc==')
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
r = urlopen(req, context=ctx)
rString = r.read().decode("utf-8")
#print(rString)
xmlparse = xml.dom.minidom.parseString(rString)
prettyxml = xmlparse.toprettyxml()
#print(prettyxml)
access_points = xmlparse.getElementsByTagName('AccessPoint')
for access_point in access_points:
    ap_name = access_point.getAttribute('name')
    ap_mac = access_point.getAttribute('ethMacAddress')
    ap_ip = access_point.getAttribute('ipAddress')
    print(access_point.tagName + ': ' + ap_name + '\t mac: ' + ap_mac + '\t ip: '+ ap_ip)
r.close()
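Since the goal stated at the top is to get XML into a Python structure, a possible next step is to collect the attributes into a list of dictionaries instead of printing them. The following is a sketch under assumptions, not the lab’s code: the function name access_points_to_dicts and the sample XML are made up for illustration. It also reads the nested ApInterface elements to show that getElementsByTagName() can be called on an element as well as on the document.

```python
import xml.dom.minidom

def access_points_to_dicts(xml_text):
    """Parse XML text and return one dict per AccessPoint element."""
    doc = xml.dom.minidom.parseString(xml_text)
    access_points = []
    for ap in doc.getElementsByTagName('AccessPoint'):
        access_points.append({
            'name': ap.getAttribute('name'),
            'mac': ap.getAttribute('ethMacAddress'),
            'ip': ap.getAttribute('ipAddress'),
            # Search only inside this AccessPoint for its radio interfaces.
            'bands': [i.getAttribute('band')
                      for i in ap.getElementsByTagName('ApInterface')],
        })
    return access_points

# Hand-written sample modelled on the sandbox response.
sample_xml = (
    '<Floor>'
    '<AccessPoint name="T1-3" ethMacAddress="00:2b:01:00:04:f0" ipAddress="10.10.20.243">'
    '<ApInterface band="IEEE_802_11_B" slotNumber="0"/>'
    '<ApInterface band="IEEE_802_11_A" slotNumber="1"/>'
    '</AccessPoint>'
    '</Floor>'
)

for ap in access_points_to_dicts(sample_xml):
    print(ap)
```

Returning plain dictionaries keeps the rest of a script independent of Minidom, so the same data could later be filtered, sorted, or dumped as JSON.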
Final Thoughts
I just went through parsing XML data using Python’s Minidom library. I found it straightforward, with Python’s documentation clearly defining not just Minidom but also the Element objects.
The Element object’s getAttribute method helps us narrow down all the XML data to just the data we need (parsing).
Now, I’d like to try to figure out how I can do this against my Cisco C9800-CL.