Beautiful Soup - identify tag based on position next to comment

I'm using Beautiful Soup.

Is there any way that I can get hold of a tag based on its position next to a comment (something not included in the parse tree)?

For example, let's say I have...

<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--text-->
<p>paragraph 3</p>
</body>
</html>

In this example, how might I identify <p>paragraph 2</p>, given that I'm searching for the comment "<!--text-->"?

Thanks for any help.


Comments appear in the BeautifulSoup parse tree like any other node. For example, to find the comment with the text "some comment text" and then print out the previous <p> element, you could do:

from BeautifulSoup import BeautifulSoup, Comment

soup = BeautifulSoup('''<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>''')

def right_comment(e):
    return isinstance(e, Comment) and e == 'some comment text'

e = soup.find(text=right_comment)

print e.findPreviousSibling('p')

... that will print out:

<p>paragraph 2</p>
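
The snippet above is written for BeautifulSoup 3 and Python 2. As a minimal sketch of the same approach, assuming Beautiful Soup 4 (the bs4 package) and Python 3 are what you have installed, something like the following should work. The names is_target_comment, previous_p and next_p are made up for illustration, and the built-in 'html.parser' backend is an assumption; another parser could be substituted.

```python
from bs4 import BeautifulSoup, Comment

html = '''<html>
<body>
<p>paragraph 1</p>
<p>paragraph 2</p>
<!--some comment text-->
<p>paragraph 3</p>
</body>
</html>'''

# 'html.parser' is the parser bundled with Python (an assumption here)
soup = BeautifulSoup(html, 'html.parser')

def is_target_comment(node):
    # Comment is a string subclass, so it can be compared to plain text
    return isinstance(node, Comment) and node.strip() == 'some comment text'

# Find the comment node, then walk to its sibling <p> elements
comment = soup.find(string=is_target_comment)
previous_p = comment.find_previous_sibling('p')
next_p = comment.find_next_sibling('p')

print(previous_p)  # <p>paragraph 2</p>
print(next_p)      # <p>paragraph 3</p>
```

The sibling lookups skip the whitespace text nodes between the elements, because passing 'p' restricts the search to <p> tags.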