
How can I guess the charset of an HTML document?

Some malformed and incomplete HTML pages have no charset information assigned to them, and I have to figure out how to display them. Since there are dozens of encoding systems, I wonder if there is an algorithm I can use to correctly perform this task. Is there such a thing?

Thanks!


Try jchardet or chsdet. Character set detection is probabilistic, so it may go wrong in some cases. I used jchardet successfully a few years back.
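To give a feel for what a detector does before it falls back on statistics, here is a rough standard-library-only sketch in Python. It is not how jchardet or chsdet work internally. It only checks the cheap, deterministic signals first: a byte order mark, a leftover `<meta>` charset declaration, and whether the bytes are valid UTF-8. The function name `guess_charset`, the 1024-byte scan window, and the `cp1252` fallback are my own assumptions for illustration.

```python
import codecs
import re

# Byte order marks, with UTF-32 checked before UTF-16 because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
_META_CHARSET = re.compile(
    rb"""<meta[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)


def guess_charset(data: bytes, fallback: str = "cp1252") -> str:
    """Return a best-effort Python codec name for raw HTML bytes.

    Hypothetical helper: the order of checks and the fallback are
    assumptions, and the result is a guess, not a guarantee.
    """
    # 1. A byte order mark is the strongest signal available.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. Look for a declaration near the top of the document
    #    (the 1024-byte window is an arbitrary choice here).
    match = _META_CHARSET.search(data[:1024])
    if match:
        label = match.group(1).decode("ascii")
        try:
            # Normalise the label to Python's canonical codec name.
            return codecs.lookup(label).name
        except LookupError:
            pass  # Unknown label: ignore it and keep guessing.

    # 3. Text in a legacy 8-bit encoding rarely happens to be valid
    #    UTF-8, so a clean strict decode is good evidence for UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # 4. Nothing matched: fall back to a common Western default.
    return fallback


if __name__ == "__main__":
    print(guess_charset("<p>héllo</p>".encode("utf-8")))
    print(guess_charset(b"<p>caf\xe9</p>"))
```

The last step is where this sketch stops being useful: it cannot tell one legacy encoding from another (for example Shift_JIS from windows-1251). That is the part statistical detectors such as jchardet try to handle, and it is also where they can guess wrong.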
