BeautifulSoup: using strings to get a value

Is it possible to use a string to get a value of a tag?

XML structure:

book
   title
      titletext
book
   title
      titletext

Code:

books = BeautifulStoneSoup().findAll('book')
for book in books:
    book.title.titletext.string
    #book.get_by_string('title.titletext').string is this possible?

If it's not possible does getattr support multiple levels?

getattr(book, 'title.titletext').string

I did some testing and this doesn't seem to be possible but maybe there is an alternative?

If there isn't I guess I have to write my own recursive function to find the attribute?
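For reference, the built-in getattr only resolves a single attribute name, so a dotted string like 'title.titletext' is looked up as one literal name and fails. A small helper can walk the path one level at a time, and it does not need to be recursive. The sketch below is only an illustration: get_by_path is a hypothetical name, and SimpleNamespace objects stand in for the parsed tags so the snippet runs without BeautifulSoup installed.

```python
from functools import reduce
from types import SimpleNamespace


def get_by_path(obj, dotted_path):
    """Follow a dotted attribute path such as 'title.titletext' one level at a time."""
    # reduce applies getattr repeatedly: getattr(getattr(obj, 'title'), 'titletext')
    return reduce(getattr, dotted_path.split('.'), obj)


# Stand-in object (an assumption, not a real BeautifulSoup tag) mirroring book.title.titletext.string
book = SimpleNamespace(
    title=SimpleNamespace(titletext=SimpleNamespace(string='Example'))
)

value = get_by_path(book, 'title.titletext').string
```

With real parsed tags the same helper should work wherever plain attribute access like book.title.titletext works, but a missing tag somewhere along the path will most likely surface as an AttributeError, so the caller may want to catch that.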


I would suggest looking into ElementTree. It has what you need. As a quick example:

import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

doc = ET.parse(filename)  # filename: path to your XML file
for e in doc.iter('title'):  # iter() replaces the removed getiterator()
    book_title = e.attrib['titletext']

Obviously I'm not handling error conditions here, but wrapping the lookup in try/except KeyError or checking whether 'titletext' is in e.attrib is sufficient.
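As a rough sketch of that error handling, Element.get returns a default instead of raising KeyError when the attribute is absent. The sample XML below is made up for illustration, with one title that carries a titletext attribute and one that does not.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the second <title> has no titletext attribute.
SAMPLE = (
    '<books>'
    '<book><title titletext="First"/></book>'
    '<book><title/></book>'
    '</books>'
)

root = ET.fromstring(SAMPLE)

titles = []
for e in root.iter('title'):
    # e.get(key, default) avoids the KeyError that e.attrib[key] would raise
    titles.append(e.get('titletext', None))
```

Checking 'titletext' in e.attrib before indexing would behave the same way; get is just the shorter form.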

If you are looking for a specific tag rather than an attribute of a tag, the same approach works with the tag name:

import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9

doc = ET.parse(filename)  # filename: path to your XML file
for e in doc.iter('titletext'):  # iter() replaces the removed getiterator()
    book_title = e.text
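ElementTree also accepts a path string directly, which is close to what the question asks for: find and findtext take a slash-separated path relative to the element. The sketch below assumes the book/title/titletext nesting from the question; the wrapping books root element, the sample text, and the book_values helper name are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document following the structure in the question,
# plus one book without a titletext to show the missing case.
SAMPLE = """
<books>
  <book><title><titletext>First book</titletext></title></book>
  <book><title><titletext>Second book</titletext></title></book>
  <book><title/></book>
</books>
"""


def book_values(root, path):
    """Return the text found at `path` under each <book>, or None if absent."""
    # findtext resolves the slash-separated path relative to each book element
    return [book.findtext(path) for book in root.iter('book')]


root = ET.fromstring(SAMPLE)
values = book_values(root, 'title/titletext')
```

Because the path is an ordinary string, it can come from configuration or user input, and a missing element yields None rather than an exception.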

In general, I've found ElementTree easier to work with than BeautifulSoup, at least for the kinds of things that I work with. I've found that it's slightly faster for our cases and it handles cases like yours more easily (in my opinion).

HTH.
