How can I get the only element of a certain type out of a list more cleanly than what I am doing?
I am working with some XML files. The schema for the files specifies that there can be only one element of a certain type (in this case, the footnotes element).
The footnotes element can contain several footnote elements. I am trying to grab the footnotes element so that I can iterate through it and find the footnote elements.
Here is my current approach:
# od is assumed to be an alias for collections.OrderedDict
from collections import OrderedDict as od

def get_footnotes(element_list):
    footnoteDict=od()
    footnotes_element=[item for item in element_list if item.tag=='footnotes'][0]
    for eachFootnote in footnotes_element.iter():
        if eachFootnote.tag=='footnote':
            footnoteDict[eachFootnote.values()[0]]=eachFootnote.text
    return footnoteDict
element_list is a list of the elements that are relevant to me, collected after iterating through the entire tree.
So I am wondering whether there is a more Pythonic way to get the footnotes element than iterating through the list of elements. This line in particular seems clumsy to me:
footnotes_element=[item for item in element_list if item.tag=='footnotes'][0]
Something like this should do the job:
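from lxml import etree
xmltree = etree.fromstring(your_xml)
# The leading "." keeps the path relative; iterfind() on an element rejects an absolute "//..." path.
for footnote in xmltree.iterfind(".//footnotes/footnote"):
    # do something
    pass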
It's easier to help if you provide some sample XML.
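In the absence of real sample XML, here is a minimal sketch against a made-up document. The element layout and the `id` attribute are assumptions, not taken from your schema. It uses the standard library's `xml.etree.ElementTree` so it runs without extra dependencies; the same relative path should work with `lxml.etree` as well. The last lines show `next()` with a generator expression, which may be a tidier way to pull the single footnotes element out of a list you already have.

```python
import xml.etree.ElementTree as ET
from collections import OrderedDict

# Hypothetical sample document; the real schema is not shown in the question.
SAMPLE_XML = """
<document>
  <body>Some text</body>
  <footnotes>
    <footnote id="fn1">First note</footnote>
    <footnote id="fn2">Second note</footnote>
  </footnotes>
</document>
"""


def get_footnotes(root):
    """Map each footnote's (assumed) id attribute to its text, in document order."""
    footnotes = OrderedDict()
    # Relative path: any footnotes element below root, then its footnote children.
    for footnote in root.iterfind(".//footnotes/footnote"):
        footnotes[footnote.get("id")] = footnote.text
    return footnotes


root = ET.fromstring(SAMPLE_XML)
result = get_footnotes(root)

# If all you have is a flat list of elements, next() stops at the first match
# instead of building a temporary list and indexing into it.
element_list = list(root)
footnotes_element = next((el for el in element_list if el.tag == "footnotes"), None)
```

Passing `None` as the default to `next()` avoids an exception when no footnotes element is present; drop the default if a missing element should be treated as an error.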
Edit:
If you are working with really big files, you might want to look into iterparse.
This question seems to have quite a nice example: python's lxml and iterparse method
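As a rough idea of what that could look like, here is a small sketch using `iterparse` from the standard library's `xml.etree.ElementTree` (lxml offers an `iterparse` with a similar interface). The sample data and the `id` attribute are made up for illustration, and an in-memory buffer stands in for a large file on disk.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical data standing in for a large file; the id attribute is an assumption.
SAMPLE = b"""<document>
  <body>Some text</body>
  <footnotes>
    <footnote id="fn1">First note</footnote>
    <footnote id="fn2">Second note</footnote>
  </footnotes>
</document>"""


def iter_footnotes(source):
    """Yield (id, text) pairs for footnote elements as the parser reaches them."""
    # "end" events fire once an element has been read completely.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "footnote":
            yield elem.get("id"), elem.text
            # Drop the element's text, attributes and children once it has been used.
            elem.clear()


pairs = list(iter_footnotes(io.BytesIO(SAMPLE)))
```

`iterparse` also accepts a filename, so for a real file you could pass the path instead of the `BytesIO` object.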