Python: How to delete an HTML header from a text string? [duplicate]

This question already has answers here. Closed 11 years ago.

Possible Duplicate:

using python, Remove HTML tags/formatting from a string

I read in an HTML file:

import re

with open("Tree.html", "r") as fi:
    text = fi.read()

I want to delete the HTML header from the text:

text = re.sub("<head>.*?</head>", "", text)

Why does this not work?


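It looks like you're not catching newlines. By default, "." matches any character except a newline, so the pattern only matches when <head> and </head> are on the same line. You need to add the DOTALL flag, which makes "." match newlines as well.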

text = re.sub("<head>.*?</head>", "", text, flags=re.DOTALL)
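To make the difference concrete, here is a minimal sketch that runs both calls on a small multi-line document. The sample string and the variable names `html`, `unchanged`, and `stripped` are made up for illustration and are not from the question.

```python
import re

# Hypothetical sample document; the <head> element spans several lines.
html = """<html>
<head>
<title>Tree</title>
</head>
<body><p>Hello</p></body>
</html>"""

# Without DOTALL, "." stops at every newline, so the pattern never matches
# and re.sub returns the input unchanged.
unchanged = re.sub(r"<head>.*?</head>", "", html)

# With DOTALL, "." also matches newlines, so the whole element is removed.
stripped = re.sub(r"<head>.*?</head>", "", html, flags=re.DOTALL)

print(unchanged == html)     # True
print("<head>" in stripped)  # False
```

Note that this pattern is case-sensitive and only matches a bare <head> tag. If the real file uses <HEAD> or a tag with attributes, it would likely need re.IGNORECASE and a looser opening-tag pattern.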