BeautifulSoup: using strings to get a value
Is it possible to use a string to get a value of a tag?
XML structure:
book
    title
        titletext
book
    title
        titletext
Code:
books = BeautifulStoneSoup().findAll('book')
for book in books:
    book.title.titletext.string
    # book.get_by_string('title.titletext').string  <- is this possible?
If it's not possible does getattr support multiple levels?
getattr(book, 'title.titletext').string
I did some testing and this doesn't seem to be possible but maybe there is an alternative?
If there isn't I guess I have to write my own recursive function to find the attribute?
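On the getattr part: getattr resolves exactly one attribute name per call, so a dotted string such as 'title.titletext' is treated as a single name rather than a path. A recursive function is not strictly needed, though. A rough sketch of a helper that splits the string and applies getattr level by level is below. Note that get_by_path is a made-up name, not part of BeautifulSoup, and the SimpleNamespace object is only a stand-in for a parsed book tag so the snippet runs on its own.

```python
from functools import reduce
from types import SimpleNamespace


def get_by_path(obj, dotted_path):
    """Follow a dotted attribute path such as 'title.titletext', one level at a time."""
    # reduce calls getattr(obj, 'title'), then getattr(<result>, 'titletext'), and so on.
    return reduce(getattr, dotted_path.split('.'), obj)


# Stand-in for a parsed <book> tag (assumption, for illustration only).
book = SimpleNamespace(
    title=SimpleNamespace(titletext=SimpleNamespace(string='Example title'))
)

value = get_by_path(book, 'title.titletext').string
```

With a real soup you would pass the tag itself instead of the stand-in, since tag.title style access also goes through attribute lookup.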
I would suggest looking into ElementTree. It has what you need. As a quick example:
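import xml.etree.ElementTree as ET  # cElementTree and getiterator() were removed in Python 3.9

doc = ET.parse(filename)  # filename: path to your XML file
for e in doc.iter('title'):
    # here 'titletext' is read as an attribute of <title>
    book_title = e.attrib['titletext']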
Obviously I'm not handling error conditions, but using try/except or checking to see if 'titletext' is in the dict is sufficient.
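As a rough illustration of the "check first" option, Element.get returns None (or a default you supply) when the attribute is missing, so no try/except is needed. The sample XML below is invented for the example and is not the asker's data.

```python
import xml.etree.ElementTree as ET

# Made-up sample: one <title> has the attribute, the other does not.
sample = '<books><title titletext="First"/><title/></books>'
root = ET.fromstring(sample)

titles = []
for e in root.iter('title'):
    # .get() returns None instead of raising KeyError when the attribute is absent
    book_title = e.get('titletext')
    if book_title is not None:
        titles.append(book_title)
```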
If you are looking for a specific tag rather than an attribute of a tag, the same approach works with a small change:
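import xml.etree.ElementTree as ET  # cElementTree and getiterator() were removed in Python 3.9

doc = ET.parse(filename)  # filename: path to your XML file
for e in doc.iter('titletext'):
    # here 'titletext' is a child tag, so read its text content
    book_title = e.text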
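Since the question was about using a string to reach a nested tag: ElementTree's find, findall and findtext accept a simple path string, so 'title/titletext' can be passed directly. A minimal sketch, assuming the nesting is book, then title, then titletext as in the question; the sample document and its root tag are invented for the example.

```python
import xml.etree.ElementTree as ET

# Made-up sample following the book > title > titletext nesting from the question.
sample = """<library>
  <book><title><titletext>First book</titletext></title></book>
  <book><title><titletext>Second book</titletext></title></book>
  <book><title/></book>
</library>"""
root = ET.fromstring(sample)

path = 'title/titletext'  # the "string" that selects the nested tag

# findtext returns the text of the first match, or None if nothing matches the path
titles = [book.findtext(path) for book in root.iter('book')]
```

The path is relative to each book element, and a book without a titletext simply yields None instead of raising.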
In general, I've found ElementTree easier to work with than BeautifulSoup, at least for the kinds of things that I work with. I've found that it's slightly faster for our cases and it handles cases like yours more easily (in my opinion).
HTH.