Parsing XML with Python Minidom

source link: https://rowelldionicio.com/parsing-xml-with-python-minidom/

November 29, 2019 By Rowell

A core skill for a DevNet associate is knowing how to parse XML into a Python structure.

Data can be retrieved from devices in different data formats such as XML, YAML, or JSON.

In this lab, I’ll look at parsing XML into a usable structure within Python.

The first step is to use a Python script to send an HTTP request to our device so we can obtain data, which will be returned in XML format.

Cisco provides a sandbox to test against with this Python script through their Coding 201 Parsing XML lab. I had to modify the script a little so it ignores certificate verification, and I had to select another sandbox due to authorization issues.

from urllib.request import Request, urlopen
import ssl

# Build the request for the sandbox API and add the Basic auth header
req = Request('https://msesandbox.cisco.com/api/contextaware/v1/maps/info/DevNetCampus/DevNetBuilding/DevNetZone')
req.add_header('Authorization', 'Basic bGVhcm5pbmc6bGVhcm5pbmc==')

# Skip certificate verification (only acceptable against a lab sandbox)
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

# Send the request and decode the XML response body into a string
r = urlopen(req, context=ctx)
rString = r.read().decode("utf-8")

print(rString)
r.close()
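
The Authorization header in the script carries a pre-encoded Basic credential. As a rough sketch of how such a value is usually built, the snippet below Base64-encodes a username and password pair. The helper name basic_auth_header, the credentials, and the example.com URL are all hypothetical placeholders, not values from the Cisco lab, and no request is actually sent.

```python
import base64
from urllib.request import Request


def basic_auth_header(username: str, password: str) -> str:
    """Return a value suitable for an HTTP Basic 'Authorization' header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical credentials and URL, used only to illustrate the encoding step.
req = Request("https://example.com/api")
req.add_header("Authorization", basic_auth_header("user", "secret"))

# Read the header back from the Request object (nothing is sent over the network).
header_value = req.get_header("Authorization")
print(header_value)
```

Building the header from variables like this avoids pasting an opaque encoded string into the script, though the credentials themselves would still need to be stored somewhere safe.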

The following is the output from our UGLY request:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Floor objectVersion="19" name="DevNetZone" isOutdoor="false" floorNumber="1" floorRefId="723413320329068590"><Dimension length="81.9" width="307.0" height="16.5" offsetX="0.0" offsetY="0.0" unit="FEET"/><Image imageName="domain_0_1421088463647.png"/><GPSMarker name="GPS_Marker_17"><GeoCoordinate latitude="36.125859" longitude="-97.066969" unit="DEGREES"/><MapCoordinate x="0.6" y="0.6" unit="FEET"/></GPSMarker><GPSMarker name="GPS_Marker_18"><GeoCoordinate latitude="36.125859" longitude="-97.06595" unit="DEGREES"/><MapCoordinate x="299.77" y="0.6" unit="FEET"/></GPSMarker><GPSMarker name="GPS_Marker_19"><GeoCoordinate latitude="36.125641" longitude="-97.066969" unit="DEGREES"/><MapCoordinate x="0.6" y="80.09" unit="FEET"/></GPSMarker><AccessPoint name="T1-3" radioMacAddress="00:2b:01:00:04:00" ethMacAddress="00:2b:01:00:04:f0" ipAddress="10.10.20.243" numOfSlots="2" apMode="LOCAL"><MapCoordinate x="155.28" y="57.57" unit="FEET"/><ApInterface band="IEEE_802_11_B" slotNumber="0" channelAssignment="1" channelNumber="1" txPowerLevel="1" antennaPattern="Internal-1140-2.4GHz" antennaAngle="1.57" 
**TRUNCATED**

As I’ve described in a previous post, XML is just another data format. It’s commonly used by other network vendors such as Juniper and Palo Alto Networks.

To parse XML data we need to import a library. Python’s standard library includes more than one XML parser (xml.etree.ElementTree is another common choice), but in this post we’ll look at Minidom:

Parsing XML with Minidom

Minidom (xml.dom.minidom) is a module in Python’s standard library whose name is short for Minimal DOM. It’s a minimal implementation of the Document Object Model interface.

To import the library just add the following to the top of the script:

import xml.dom.minidom

Now we modify our script so we can parse the returned XML data.

We’ll add the following lines:

xmlparse = xml.dom.minidom.parseString(rString)
prettyxml = xmlparse.toprettyxml()
print(prettyxml)

We call the parseString() function, xml.dom.minidom.parseString(rString), to parse the XML string into a DOM Document object and assign it to xmlparse.

Next, we call the toprettyxml() method on xmlparse to make a pretty-printed version of the XML output we just saw.
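
To try these two calls without reaching the sandbox, here is a minimal offline sketch. The sample string is a small hand-written stand-in shaped like the sandbox response, not real output, and the two-space indent is just a choice for this example.

```python
import xml.dom.minidom

# Hand-written sample that mimics the shape of the sandbox response.
sample = '<Floor name="DevNetZone"><AccessPoint name="T1-3" ipAddress="10.10.20.243"/></Floor>'

# parseString() takes XML text and returns a Document object.
doc = xml.dom.minidom.parseString(sample)

# toprettyxml() returns a string; indent defaults to a tab, so we pass two spaces here.
pretty = doc.toprettyxml(indent="  ")
print(pretty)
```

Because toprettyxml() only returns a string, the parsed Document in doc stays available for the queries later in the post.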

The following output is much more readable. We can see the root node, the nested elements, and so on:

<?xml version="1.0" ?>
<Floor floorNumber="1" floorRefId="723413320329068590" isOutdoor="false" name="DevNetZone" objectVersion="19">
    <Dimension height="16.5" length="81.9" offsetX="0.0" offsetY="0.0" unit="FEET" width="307.0"/>
    <Image imageName="domain_0_1421088463647.png"/>
    <GPSMarker name="GPS_Marker_17">
        <GeoCoordinate latitude="36.125859" longitude="-97.066969" unit="DEGREES"/>
        <MapCoordinate unit="FEET" x="0.6" y="0.6"/>
    </GPSMarker>
    <GPSMarker name="GPS_Marker_18">
        <GeoCoordinate latitude="36.125859" longitude="-97.06595" unit="DEGREES"/>
        <MapCoordinate unit="FEET" x="299.77" y="0.6"/>
    </GPSMarker>
    <GPSMarker name="GPS_Marker_19">
        <GeoCoordinate latitude="36.125641" longitude="-97.066969" unit="DEGREES"/>
        <MapCoordinate unit="FEET" x="0.6" y="80.09"/>
    </GPSMarker>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:04:f0" ipAddress="10.10.20.243" name="T1-3" numOfSlots="2" radioMacAddress="00:2b:01:00:04:00">
        <MapCoordinate unit="FEET" x="155.28" y="57.57"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:05:f0" ipAddress="10.10.20.244" name="T1-4" numOfSlots="2" radioMacAddress="00:2b:01:00:05:00">
        <MapCoordinate unit="FEET" x="213.6" y="12.6"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:06:f0" ipAddress="10.10.20.245" name="T1-5" numOfSlots="2" radioMacAddress="00:2b:01:00:06:00">
        <MapCoordinate unit="FEET" x="253.7" y="58.48"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:03:f0" ipAddress="10.10.20.242" name="T1-2" numOfSlots="2" radioMacAddress="00:2b:01:00:03:00">
        <MapCoordinate unit="FEET" x="98.1" y="11.7"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:02:f0" ipAddress="10.10.20.241" name="T1-1" numOfSlots="2" radioMacAddress="00:2b:01:00:02:00">
        <MapCoordinate unit="FEET" x="43.9" y="57.88"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="0" antennaPattern="Internal-1140-2.4GHz" band="IEEE_802_11_B" channelAssignment="1" channelNumber="1" slotNumber="0" txPowerLevel="1"/>
        <ApInterface antennaAngle="1.57" antennaElevAngle="0.0" antennaGain="11" antennaPattern="Internal-1140-5.0GHz" band="IEEE_802_11_A" channelAssignment="1" channelNumber="64" slotNumber="1" txPowerLevel="5"/>
    </AccessPoint>
    <LocationFilterRegion regionType="OUTSIDE">
        <MapCoordinate unit="FEET" x="0.0" y="0.0"/>
        <MapCoordinate unit="FEET" x="307.0" y="0.0"/>
        <MapCoordinate unit="FEET" x="307.0" y="81.9"/>
        <MapCoordinate unit="FEET" x="0.0" y="81.9"/>
    </LocationFilterRegion>
</Floor>
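
The root node and its nested elements can also be inspected from code. The sketch below assumes a trimmed, hand-written sample in the same shape as the output above; documentElement gives the root element, and childNodes lists its direct children.

```python
import xml.dom.minidom

# Trimmed, hand-written sample in the same shape as the pretty-printed output.
sample = """<Floor name="DevNetZone" floorNumber="1">
  <Dimension unit="FEET"/>
  <AccessPoint name="T1-3"/>
  <AccessPoint name="T1-4"/>
</Floor>"""

doc = xml.dom.minidom.parseString(sample)

# documentElement is the root element of the document (Floor here).
root = doc.documentElement
print(root.tagName, root.getAttribute("name"))

# childNodes also contains whitespace text nodes, so keep only element nodes.
child_tags = [n.tagName for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
print(child_tags)
```

Filtering on nodeType matters because the line breaks and indentation between tags are stored as text nodes alongside the elements.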

Now that we have our response we need to identify what we’re looking for. I want to get information about access points.

I see a child element, AccessPoint, whose attributes describe each access point:

<AccessPoint apMode="LOCAL" ethMacAddress="00:2b:01:00:03:f0" ipAddress="10.10.20.242" name="T1-2" numOfSlots="2" radioMacAddress="00:2b:01:00:03:00"></AccessPoint>

Let’s grab each access point’s name, ethMacAddress, and ipAddress attributes.

Time to parse through the XML data and get only what we need.

access_points = xmlparse.getElementsByTagName('AccessPoint')
for access_point in access_points:
    ap_name = access_point.getAttribute('name')
    ap_mac = access_point.getAttribute('ethMacAddress')
    ap_ip = access_point.getAttribute('ipAddress')
    print(access_point.tagName + ': ' + ap_name + '\t mac: ' + ap_mac + '\t ip: '+ ap_ip)

What does this do?

Let’s analyze what each line does 🤔

access_points = xmlparse.getElementsByTagName('AccessPoint') – With minidom it is possible to search the parsed tree by tag name. That’s what we’re doing here with xmlparse.getElementsByTagName('AccessPoint'), which returns every element named AccessPoint, at any depth in the document.

Next we use a for loop to cycle through the elements we found. In this case, the AccessPoint elements.
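
One detail worth seeing in isolation: getElementsByTagName can be called on the whole document or on a single element, and the scope of the search changes accordingly. The sketch below uses a hand-written sample with MapCoordinate under two different parents, as in the sandbox data.

```python
import xml.dom.minidom

# Hand-written sample: MapCoordinate appears under two different parents.
sample = """<Floor>
  <GPSMarker name="GPS_Marker_17"><MapCoordinate x="0.6" y="0.6"/></GPSMarker>
  <AccessPoint name="T1-3"><MapCoordinate x="155.28" y="57.57"/></AccessPoint>
</Floor>"""

doc = xml.dom.minidom.parseString(sample)

# Called on the document, the search covers every element at any depth.
all_coords = doc.getElementsByTagName("MapCoordinate")
print(len(all_coords))

# Called on one element, the search is limited to that element's descendants.
ap = doc.getElementsByTagName("AccessPoint")[0]
ap_coords = ap.getElementsByTagName("MapCoordinate")
print(ap_coords[0].getAttribute("x"))
```

Scoping the search to one AccessPoint is a simple way to pick up that access point's own coordinates without also matching the GPS markers.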

Within the for loop there are three variables: ap_name, ap_mac, and ap_ip.

We’re going to use getAttribute(), a method of Minidom’s Element object, to return the value of the named attribute.

access_point.getAttribute('name') – For each element with a tag of AccessPoint we want it to return the name of that access point and assign that value to ap_name.

access_point.getAttribute('ethMacAddress') – We’re going to return the MAC address of the access point from the ethMacAddress attribute and assign it to ap_mac. If the attribute doesn’t exist, getAttribute() returns an empty string.

access_point.getAttribute('ipAddress') – The next attribute I want to collect is the IP address, which is assigned to ap_ip.
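
When an attribute is missing, minidom's getAttribute() returns an empty string rather than raising an error. A small sketch of how that could be handled, using a hand-written element that has no ipAddress attribute (the "unknown" fallback is only an illustrative choice):

```python
import xml.dom.minidom

# Hand-written sample: this AccessPoint has a name but no ipAddress attribute.
doc = xml.dom.minidom.parseString('<Floor><AccessPoint name="T1-3"/></Floor>')
ap = doc.getElementsByTagName("AccessPoint")[0]

# getAttribute() gives back an empty string when the attribute is absent.
missing = ap.getAttribute("ipAddress")

# hasAttribute() lets us tell "absent" apart from "present but empty".
has_ip = ap.hasAttribute("ipAddress")
ap_ip = ap.getAttribute("ipAddress") if has_ip else "unknown"
print(repr(missing), has_ip, ap_ip)
```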

Next, I want to display that information on the screen. With the print() call, print(access_point.tagName + ': ' + ap_name + '\t mac: ' + ap_mac + '\t ip: '+ ap_ip), we’re going to show the attributes we collected.

The output is much more appealing now:

% python parsing_xml_with_python.py
AccessPoint: T1-3     mac: 00:2b:01:00:04:f0  ip: 10.10.20.243
AccessPoint: T1-4     mac: 00:2b:01:00:05:f0  ip: 10.10.20.244
AccessPoint: T1-5     mac: 00:2b:01:00:06:f0  ip: 10.10.20.245
AccessPoint: T1-2     mac: 00:2b:01:00:03:f0  ip: 10.10.20.242
AccessPoint: T1-1     mac: 00:2b:01:00:02:f0  ip: 10.10.20.241
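
If the values are needed later in the script rather than only printed, one option is to collect them into a list of dictionaries first. This is a sketch under that assumption; the function name access_points_from_xml and the dictionary keys are made up for this example, and the sample is a hand-written subset of the sandbox data.

```python
import xml.dom.minidom


def access_points_from_xml(xml_text):
    """Return one dict per AccessPoint element found in the XML text."""
    doc = xml.dom.minidom.parseString(xml_text)
    return [
        {
            "name": ap.getAttribute("name"),
            "mac": ap.getAttribute("ethMacAddress"),
            "ip": ap.getAttribute("ipAddress"),
        }
        for ap in doc.getElementsByTagName("AccessPoint")
    ]


# Hand-written subset of the sandbox response, used so the example runs offline.
sample = """<Floor name="DevNetZone">
  <AccessPoint name="T1-3" ethMacAddress="00:2b:01:00:04:f0" ipAddress="10.10.20.243"/>
  <AccessPoint name="T1-4" ethMacAddress="00:2b:01:00:05:f0" ipAddress="10.10.20.244"/>
</Floor>"""

aps = access_points_from_xml(sample)
for ap in aps:
    print(f"AccessPoint: {ap['name']}\t mac: {ap['mac']}\t ip: {ap['ip']}")
```

Keeping the parsing in a function that takes a string also makes it easy to test against saved responses instead of the live sandbox.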

Final Script

from urllib.request import Request, urlopen
import ssl
import xml.dom.minidom

req = Request('https://msesandbox.cisco.com/api/contextaware/v1/maps/info/DevNetCampus/DevNetBuilding/DevNetZone')
req.add_header('Authorization', 'Basic bGVhcm5pbmc6bGVhcm5pbmc==')

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

r = urlopen(req, context=ctx)
rString = r.read().decode("utf-8")

#print(rString)

xmlparse = xml.dom.minidom.parseString(rString)
prettyxml = xmlparse.toprettyxml()
#print(prettyxml)

access_points = xmlparse.getElementsByTagName('AccessPoint')
for access_point in access_points:
    ap_name = access_point.getAttribute('name')
    ap_mac = access_point.getAttribute('ethMacAddress')
    ap_ip = access_point.getAttribute('ipAddress')
    print(access_point.tagName + ': ' + ap_name + '\t mac: ' + ap_mac + '\t ip: '+ ap_ip)

r.close()

Final Thoughts

I just went through parsing XML data using Python’s Minidom library. I found it straightforward, since Python’s documentation clearly describes not just Minidom but also its Element objects.

The Element object’s getAttribute() method helps us narrow down all the XML data to just the data we need (parsing).

Now, I’d like to try to figure out how I can do this against my Cisco C9800-CL.

