How to parse encoding data from an RSS feed?

I am parsing an RSS feed, but I am not able to parse the encoding data from the feed. How do I parse encoding data from the RSS feed?


It's a rough task. feedparser (Python) does a number of things to try to guess the right character set. The encoding can be declared in a few places: the XML declaration at the top of the feed, and the Content-Type header of the HTTP response (which overrides the XML declaration). If it's declared in neither place, or the declared value is completely invalid (which is quite common), feedparser falls back to statistical guessing. There's one last technique: try decoding the data as UTF-8, and if that fails, convert it from ISO-8859-1 to UTF-8 and try parsing again. Good luck!
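If you want to do it by hand, here is a rough sketch of that fallback order using only the standard library. It is not feedparser's actual code: `sniff_xml_encoding` and `decode_feed` are names I made up for illustration, and the statistical-guessing step is left out because it needs a third-party detector.

```python
import re


def sniff_xml_encoding(raw: bytes):
    """Return the encoding named in the XML declaration, or None (hypothetical helper)."""
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return match.group(1).decode("ascii") if match else None


def decode_feed(raw: bytes, http_charset=None) -> str:
    """Decode feed bytes to str, trying the most trusted source first (hypothetical helper)."""
    # Order: HTTP charset (if the caller has it), then the XML declaration, then UTF-8.
    candidates = [http_charset, sniff_xml_encoding(raw), "utf-8"]
    for encoding in candidates:
        if not encoding:
            continue
        try:
            return raw.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            # Unknown encoding name or bytes that don't fit it: try the next one.
            continue
    # Last resort: ISO-8859-1 maps every byte to a character, so this never fails.
    return raw.decode("iso-8859-1")
```

You would pass in the raw bytes of the response plus whatever charset the Content-Type header gave you, e.g. `decode_feed(body, http_charset="utf-8")`. Because ISO-8859-1 accepts any byte sequence, the last step always returns something, but the text can still be wrong if the feed was really in some other encoding.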
