How can I get the only element of a certain type out of a list more cleanly than what I am doing?

I am working with some XML files. The schema for the files specifies that there can be only one element of a certain type (in this case, the footnotes element).

The footnotes element can contain several footnote elements. I am trying to grab the footnotes element so that I can iterate through it and process the footnote elements it contains.

Here is my current approach:

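from collections import OrderedDict as od  # assumed: od is an alias for OrderedDict

def get_footnotes(element_list):
    footnoteDict=od()

    footnotes_element=[item for item in element_list if item.tag=='footnotes'][0]
    for eachFootnote in footnotes_element.iter():
        if eachFootnote.tag=='footnote':
            # key each footnote's text by its first attribute value
            footnoteDict[eachFootnote.values()[0]]=eachFootnote.text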
    return footnoteDict

element_list is a list of the elements that are relevant to me, collected after iterating through the entire tree.

So I am wondering whether there is a more Pythonic way to get the footnotes element than iterating through the list of elements. This line in particular seems clumsy to me:

footnotes_element=[item for item in element_list if item.tag=='footnotes'][0]


Something like this should do the job:

from lxml import etree

xmltree = etree.fromstring(your_xml)

# ".//" searches all descendants; a path starting with "/" is rejected on an element
for footnote in xmltree.iterfind(".//footnotes/footnote"):
    # do something
    pass

It's easier to help if you provide some sample XML.
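
Without a real sample, here is a minimal sketch of the whole function against a made-up document. The XML layout, the "id" attribute name, and the helper first_with_tag are assumptions for illustration, not something taken from the question. It uses the standard library's xml.etree.ElementTree so that it runs without extra packages; lxml.etree provides the same find() and iter() calls.

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET  # lxml.etree offers the same find()/iter() calls

# Hypothetical sample document; the "id" attribute name is an assumption.
SAMPLE_XML = """<doc>
  <body>Some text</body>
  <footnotes>
    <footnote id="f1">First note</footnote>
    <footnote id="f2">Second note</footnote>
  </footnotes>
</doc>"""


def first_with_tag(element_list, tag):
    """Return the first element in the list with the given tag, or None."""
    # next() stops at the first match instead of building a whole list
    return next((item for item in element_list if item.tag == tag), None)


def get_footnotes(root):
    """Map each footnote's first attribute value to its text, in document order."""
    footnotes = OrderedDict()
    # find() returns the first matching descendant, or None if there is none
    footnotes_element = root.find(".//footnotes")
    if footnotes_element is None:
        return footnotes
    # iter("footnote") filters by tag, so no explicit tag check is needed
    for footnote in footnotes_element.iter("footnote"):
        values = list(footnote.attrib.values())
        if values:
            footnotes[values[0]] = footnote.text
    return footnotes


root = ET.fromstring(SAMPLE_XML)
print(get_footnotes(root))
print(first_with_tag(list(root), "footnotes").tag)
```

If you really do need to start from a plain list of elements, first_with_tag shows the next() idiom: unlike indexing a list comprehension with [0], it returns None rather than raising IndexError when nothing matches.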

Edit:

If you are working with really big files, you might want to look into iterparse.

This question seems to have quite a nice example: python's lxml and iterparse method
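
As a rough sketch of the iterparse idea, the generator below handles footnote elements one at a time as the parser finishes them, instead of loading the whole tree first. The sample data, the "id" attribute, and the name iter_footnotes are made up for illustration. It uses the standard library's xml.etree.ElementTree; lxml.etree has an iterparse with the same basic usage.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for a large file on disk.
SAMPLE_BYTES = b"""<doc>
  <footnotes>
    <footnote id="f1">First note</footnote>
    <footnote id="f2">Second note</footnote>
  </footnotes>
</doc>"""


def iter_footnotes(source):
    """Yield (first attribute value, text) for each footnote as it is parsed."""
    # "end" events fire once an element has been read completely
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "footnote":
            values = list(elem.attrib.values())
            yield (values[0] if values else None, elem.text)
            # drop the element's content once it has been handed to the caller
            elem.clear()


# source may be a filename or a binary file object
for key, text in iter_footnotes(io.BytesIO(SAMPLE_BYTES)):
    print(key, text)
```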
