
Simplest way to return all of the text between a pair of HTML tags in BeautifulSoup

OK. I have a massive HTML file, and I only want the text that occurs between the tags

<center><span style="font-size: 144%;"></span></center>

and

<dl>  <dd><i></i></dd>  </dl>

I am using Python 2.6 and BeautifulSoup, but I have no idea where to begin. I'm assuming it's not difficult?


Try something like:

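import BeautifulSoup  # BeautifulSoup 3, the Python 2 module

soup = BeautifulSoup.BeautifulSoup(YOUR_HTML)  # YOUR_HTML: the page source as a string
texts = soup.findAll(text=True)  # every text node in the document
print texts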
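
Note that findAll(text=True) returns every text node in the document, not only the ones inside the two tag patterns from the question. If you only want those, one possible approach is to select the tags first and then read their text. The sketch below assumes the newer bs4 package on Python 3 rather than the Python 2.6 / BeautifulSoup 3 setup from the question, and SAMPLE_HTML and extract_texts are made-up names for illustration, since the real file isn't shown.

```python
from bs4 import BeautifulSoup  # assumption: bs4 (Beautiful Soup 4), not the BeautifulSoup 3 module

# Hypothetical sample input; the real file's contents are not shown in the question.
SAMPLE_HTML = """
<html><body>
<p>Navigation text to ignore</p>
<center><span style="font-size: 144%;">Chapter One</span></center>
<dl>  <dd><i>First entry</i></dd>  </dl>
<dl>  <dd><i>Second entry</i></dd>  </dl>
</body></html>
"""


def extract_texts(html):
    """Return (headings, entries) for the two tag patterns in the question."""
    # "html.parser" is the standard-library backend, so no extra parser is needed.
    soup = BeautifulSoup(html, "html.parser")
    # Text of every <span> that is a direct child of a <center>.
    headings = [tag.get_text(strip=True) for tag in soup.select("center > span")]
    # Text of every <i> nested directly as <dl> -> <dd> -> <i>.
    entries = [tag.get_text(strip=True) for tag in soup.select("dl > dd > i")]
    return headings, entries


headings, entries = extract_texts(SAMPLE_HTML)
print(headings)
print(entries)
```

If the real markup has other elements between these tags, the direct-child selectors (">") may be too strict; descendant selectors such as "center span" and "dl dd i" would be the looser alternative.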