BeautifulSoup question
<parent1>
<span>Text1</span>
</parnet1>
<parent2>
<span>Text2</span>
</parnet2>
<parent3>
<span>Text3</span>
</parnet3>
I'm parsing this with Python and BeautifulSoup. I have a variable soupData that holds a reference to the object I need. How can I get a reference to parent2, for example, if all I have is the text Text2? In other words, the problem is to filter span tags by their content. How can I do this?
After correcting the spelling on the end-tags:
[e for e in soup(recursive=False, text=False) if e.span.string == 'Text2']
I don't think there's a way to do it in a single step. So:
for parenttag in soupData:
    if parenttag.span.string == "Text2":
        do_stuff(parenttag)
        break
It's possible to use a generator expression, but it's not much shorter.
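For reference, a minimal sketch of the generator-expression variant, assuming BeautifulSoup 4 and the end-tags already corrected. The `<root>` wrapper and the name `soup_data` are made up for this illustration and stand in for whatever `soupData` points at. Iterating with `find_all(True, recursive=False)` visits only the direct child tags, so whitespace strings between the tags are skipped.

```python
from bs4 import BeautifulSoup

# Hypothetical wrapper element; the original post does not show the container.
markup = (
    "<root>"
    "<parent1><span>Text1</span></parent1>"
    "<parent2><span>Text2</span></parent2>"
    "<parent3><span>Text3</span></parent3>"
    "</root>"
)
soup_data = BeautifulSoup(markup, "html.parser").root

# next() stops at the first match and returns None if nothing matches.
found = next(
    (
        tag
        for tag in soup_data.find_all(True, recursive=False)  # direct child tags only
        if tag.span is not None and tag.span.string == "Text2"
    ),
    None,
)

print(found.name)  # parent2
```

The `None` default avoids a `StopIteration` when no span carries the requested text.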
Using Python 2.7.6 and BeautifulSoup 4.3.2, I found that Marcelo's answer gives an empty list. This worked for me, however:
[x.parent for x in bSoup.findAll('span') if x.text == 'Text2'][0]
Alternatively, here is a ridiculously overengineered solution, at least for this particular problem. It may still be useful if your filter criteria are too long to fit in an easily readable list comprehension:
def hasText(text):
    def hasTextFunc(x):
        return x.text == text
    return hasTextFunc
to create a function factory, then
hasTextText2 = hasText('Text2')
filter(hasTextText2,bSoup.findAll('span'))[0].parent
to get the reference to the parent tag that you were looking for.
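Note that indexing the result of `filter` with `[0]` only works on Python 2, where `filter` returns a list. On Python 3 it returns an iterator. A rough Python 3 sketch of the same function-factory idea, assuming BeautifulSoup 4 and corrected end-tags (the names `has_text`, `predicate`, and `b_soup` are just illustrative):

```python
from bs4 import BeautifulSoup

markup = (
    "<parent1><span>Text1</span></parent1>"
    "<parent2><span>Text2</span></parent2>"
    "<parent3><span>Text3</span></parent3>"
)
b_soup = BeautifulSoup(markup, "html.parser")


def has_text(text):
    """Return a predicate that is true for tags whose text equals `text`."""
    def predicate(tag):
        return tag.get_text() == text
    return predicate


# filter() is lazy on Python 3, so take the first match with next().
first_span = next(filter(has_text("Text2"), b_soup.find_all("span")), None)
parent = first_span.parent if first_span is not None else None

print(parent.name)  # parent2
```

If no span matches, `parent` is `None` rather than raising an `IndexError`.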