How do I filter values from an XML file in Python?
I have a basic grasp of XML and Python and have been using minidom with some success. I have run into a situation where I am unable to get the values I want from an XML file. Here is the basic structure of the pre-existing file.
<localization>
  <b n="Stats">
    <l k="SomeStat1">
      <v>10</v>
    </l>
    <l k="SomeStat2">
      <v>6</v>
    </l>
  </b>
  <b n="Levels">
    <l k="Level1">
      <v>Beginner Level</v>
    </l>
    <l k="Level2">
      <v>Intermediate Level</v>
    </l>
  </b>
</localization>
There are about 15 different <b> tags with dozens of children. What I'd like to do is, given a level number (1), find the <v> node for the corresponding level. I just have no idea how to go about this.
You might consider using XPath, a language for addressing parts of an XML document.
Here's the answer using lxml.etree and its support for XPath.
>>> data = """
... <localization>
... <b n="Stats">
... <l k="SomeStat1">
... <v>10</v>
... </l>
... <l k="SomeStat2">
... <v>6</v>
... </l>
... </b>
... <b n="Levels">
... <l k="Level1">
... <v>Beginner Level</v>
... </l>
... <l k="Level2">
... <v>Intermediate Level</v>
... </l>
... </b>
... </localization>
... """
>>>
>>> from lxml import etree
>>>
>>> xmldata = etree.XML(data)
>>> xmldata.xpath('/localization/b[@n="Levels"]/l[@k=$level]/v/text()',level='Level1')
['Beginner Level']
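If lxml is not installed, the standard library's xml.etree.ElementTree understands a limited subset of XPath that is enough for this lookup. The following is only a sketch: the helper name level_text and the DATA constant are made up for illustration, and because ElementTree has no XPath variables the key is formatted into the path string instead of being passed as $level.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the question.
DATA = """<localization>
  <b n="Stats">
    <l k="SomeStat1"><v>10</v></l>
    <l k="SomeStat2"><v>6</v></l>
  </b>
  <b n="Levels">
    <l k="Level1"><v>Beginner Level</v></l>
    <l k="Level2"><v>Intermediate Level</v></l>
  </b>
</localization>"""


def level_text(xml_text, level_number):
    """Return the <v> text for Level<level_number>, or None if there is no match.

    level_text is a hypothetical helper, not part of any library.
    """
    root = ET.fromstring(xml_text)  # root is the <localization> element
    # Build the path by hand: pick the "Levels" block, then the <l> with the wanted key.
    path = "./b[@n='Levels']/l[@k='Level%d']/v" % level_number
    node = root.find(path)
    return node.text if node is not None else None


print(level_text(DATA, 1))
```

Formatting the level into the path is fine for an integer; for arbitrary user-supplied strings, lxml's variable support shown above is the safer choice.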
#!/usr/bin/python
from xml.dom.minidom import parseString
xml = parseString("""<localization>
<b n="Stats">
<l k="SomeStat1">
<v>10</v>
</l>
<l k="SomeStat2">
<v>6</v>
</l>
</b>
<b n="Levels">
<l k="Level1">
<v>Beginner Level</v>
</l>
<l k="Level2">
<v>Intermediate Level</v>
</l>
</b>
</localization>""")
level = 1
blist = xml.getElementsByTagName('b')
for b in blist:
    if b.getAttribute('n') == 'Levels':
        llist = b.getElementsByTagName('l')
        # item() is zero-based, so index 1 selects the second <l> (Level2)
        l = llist.item(level)
        v = l.getElementsByTagName('v')
        print(v.item(0).firstChild.nodeValue)
# prints Intermediate Level
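The loop above picks the <l> by its position in the list, so level 1 lands on the second entry. If the goal is to go from a level number to the matching key, one possible variation is to compare the k attribute instead. This is a sketch under that assumption; find_level_value and SAMPLE are illustrative names, not part of minidom.

```python
from xml.dom.minidom import parseString

# Trimmed copy of the sample document from the question.
SAMPLE = """<localization>
<b n="Stats">
<l k="SomeStat1"><v>10</v></l>
</b>
<b n="Levels">
<l k="Level1"><v>Beginner Level</v></l>
<l k="Level2"><v>Intermediate Level</v></l>
</b>
</localization>"""


def find_level_value(xml_text, level):
    """Return the <v> text of the <l> whose k attribute is 'Level<level>', else None.

    find_level_value is a hypothetical helper written for this example.
    """
    doc = parseString(xml_text)
    wanted = "Level%d" % level
    for b in doc.getElementsByTagName("b"):
        if b.getAttribute("n") != "Levels":
            continue  # skip blocks such as "Stats"
        for l in b.getElementsByTagName("l"):
            if l.getAttribute("k") == wanted:
                v_nodes = l.getElementsByTagName("v")
                # Guard against a missing or empty <v> element.
                if v_nodes and v_nodes[0].firstChild is not None:
                    return v_nodes[0].firstChild.nodeValue
    return None


print(find_level_value(SAMPLE, 1))
```

Matching on the attribute keeps the result stable even if the order of the <l> elements in the file changes.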
If you can use the BeautifulSoup library (can't you?), you could end up with this dead-simple code:
from BeautifulSoup import BeautifulStoneSoup

def get_it(xml, level_n):
    soup = BeautifulStoneSoup(xml)
    # find the <l> tag whose k attribute matches the requested level
    l = soup.find('l', k="Level%d" % level_n)
    return l.v.string

if __name__ == '__main__':
    xml = """<document goes here>"""  # the XML from the question
    print(get_it(xml, 1))
It prints Beginner Level for the example XML you provided.
If you really only care about searching for an <l> tag with a specific "k" attribute and then getting its <v> tag (that's how I understood your question), you could do it with DOM:
from xml.dom.minidom import parseString

xmlDoc = parseString("""<document goes here>""")
lNodesWithLevel2 = [lNode for lNode in xmlDoc.getElementsByTagName("l")
                    if lNode.getAttribute("k") == "Level2"]
# getElementsByTagName returns a NodeList for each <l>, so flatten them into one list of <v> nodes
matchingVNodes = [vNode for lNode in lNodesWithLevel2
                  for vNode in lNode.getElementsByTagName("v")]
print([vNode.firstChild.nodeValue for vNode in matchingVNodes])
# Prints ['Intermediate Level'] ([u'Intermediate Level'] on Python 2)
Hope that is what you meant.
level = "Level" + raw_input("Enter level number: ")
content = open("xmlfile").read()
data = content.split("</localization>")
for item in data:
    if "localization" in item:
        s = item.split("</b>")
        for i in s:
            if """<b n="Levels">""" in i:
                for c in i.split("</l>"):
                    if "<l" in c and level in c:
                        for v in c.split("</v>"):
                            if "<v>" in v:
                                print(v[v.index("<v>")+3:])