UnicodeDecodeError in Python with codecs module

I have a text file that contains Unicode strings such as "aBiyukÙwa" and "varcasÙva". When I decode them in the Python interpreter using the following code, it works fine and decodes to u'aBiyuk\xd9wa':

"aBiyukÙwa".decode("utf-8")

But when I read the same strings from a file in a Python program, using the codecs module as in the following code, it throws a UnicodeDecodeError.

file = codecs.open('/home/abehl/TokenOutput.wx', 'r', 'utf-8')
for row in file:

Following is the error message:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd9 in position 8: invalid continuation byte

Any ideas what is causing this strange behavior?


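Your file is not encoded in UTF-8. In UTF-8, Ù (U+00D9) is stored as the two bytes 0xC3 0x99, but your file contains a single 0xd9 byte followed by an ordinary ASCII letter, which is why the decoder reports an invalid continuation byte. A lone 0xd9 for Ù is what a single-byte encoding such as Latin-1 (ISO-8859-1) or Windows-1252 produces, so one of those is a plausible candidate. The interpreter test works because the literal you type there is encoded by your terminal, which is presumably set to UTF-8. Find out what the file is actually encoded in, and then pass that encoding to codecs.open instead of 'utf-8'.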
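To make the difference concrete, here is a minimal sketch that reproduces the error and then reads the data back with a matching encoding. It assumes the file is Latin-1, which is only a guess consistent with the 0xd9 byte in the traceback, not something the question confirms. The temporary file stands in for TokenOutput.wx, and io.open is used rather than codecs.open simply because it behaves the same way on Python 2 and 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

text = u"aBiyuk\xd9wa"  # the string from the question, with U+00D9 (Ù)

# The same character is stored differently depending on the encoding.
utf8_bytes = text.encode("utf-8")      # Ù becomes two bytes: 0xC3 0x99
latin1_bytes = text.encode("latin-1")  # Ù becomes one byte:  0xD9

# Decoding the Latin-1 bytes as UTF-8 fails, as in the question:
# 0xD9 announces a two-byte sequence, but the next byte is a plain 'w'.
try:
    latin1_bytes.decode("utf-8")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    error_position = exc.start  # index of the offending 0xD9 byte

# Write a stand-in file as Latin-1 (assumed encoding), then read it
# back with the SAME encoding so that decoding succeeds.
fd, path = tempfile.mkstemp(suffix=".wx")
os.close(fd)
try:
    with io.open(path, "w", encoding="latin-1") as handle:
        handle.write(text + u"\n")
    with io.open(path, "r", encoding="latin-1") as handle:
        rows = [row.rstrip(u"\n") for row in handle]
finally:
    os.remove(path)
```

If Latin-1 turns out to be the wrong guess, the read will still succeed (every byte is valid Latin-1) but some characters will look wrong, so it is worth checking a few decoded rows by eye before trusting the result.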
