
help with UnicodeEncodeError('ascii', u'Phase \u2013 II', 6, 7, 'ordinal not in range(128)')

I'm having a problem with UnicodeEncodeError('ascii', u'Phase \u2013 II', 6, 7, 'ordinal not in range(128)'). Basically, I am reading values from an Excel sheet, and the sheet contains an address in this format:

Phase – II

So I wanted to know how to change

somestring = u'Phase \u2013 II'

to str.

Thanks


Excel mostly uses cp1252, so try this:

>>> somestring.encode('cp1252', 'replace')
'Phase \x96 II'
>>> print somestring.encode('cp1252', 'replace')
Phase – II

That doesn't give you an ASCII string (it cannot, since your unicode string contains non-ASCII characters), but it does give you a byte string that Excel will interpret correctly if, for example, you write it into a CSV file.
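As a rough sketch of the CSV case, you can let the file object do the encoding instead of calling encode yourself. The file name and the temporary directory here are made up for illustration, and io.open is used only so the same lines run on both Python 2.6+ and Python 3:

```python
import io
import os
import tempfile

somestring = u'Phase \u2013 II'

# Hypothetical output location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), 'addresses.csv')

# The file object encodes each unicode string to cp1252 on write.
# newline='' stops Python from translating the line ending we write.
with io.open(path, 'w', encoding='cp1252', errors='replace', newline='') as f:
    f.write(somestring + u'\r\n')

# Read the raw bytes back to see what actually landed on disk.
with io.open(path, 'rb') as f:
    raw = f.read()
```

The en dash ends up as the single byte 0x96, the same byte the encode call above produces.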

If you just want to print it for display, then you'll need to know the output encoding of whatever you use to display the text. I copied the example from IDLE, which, at least on my system, displays cp1252, but if you print it in a command prompt you may have another encoding in effect. Use the DOS chcp command to select an appropriate encoding if required, as the default code page may not support that character:

C:\>chcp
Active code page: 850

C:\>\python26\python
Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> somestring = u'Phase \u2013 II'
>>> print somestring.encode('cp850', 'replace')
Phase ? II
>>>
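If you would rather not hard-code the code page, one option is to ask the output stream which encoding it reports. This is only a sketch: encode_for_output is a made-up helper name, and the 'ascii' fallback is my assumption for streams that report no encoding (for example when output is piped under Python 2):

```python
import sys

somestring = u'Phase \u2013 II'

def encode_for_output(text, stream=None, fallback='ascii'):
    """Encode text for a stream, replacing characters it cannot show.

    Hypothetical helper: uses the stream's reported encoding, or the
    fallback when the stream has no encoding attribute or it is None.
    """
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, 'encoding', None) or fallback
    return text.encode(encoding, 'replace')

# Bytes suited to whatever console the script is running in.
console_bytes = encode_for_output(somestring)
```

On a cp850 console this should give the same 'Phase ? II' as the session above, and on a cp1252 display the dash survives as \x96.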

Passing 'replace' as the second argument to encode means that any characters that cannot be represented in the target encoding are replaced by question marks instead of raising UnicodeEncodeError.
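To make the difference concrete, here is a small comparison of the default strict behaviour with a few of the standard error handlers, using the ascii codec from the original error message. The variable names are just for illustration:

```python
somestring = u'Phase \u2013 II'

# Default ('strict') handling: this is the error from the question.
try:
    somestring.encode('ascii')
    strict_failed = False
except UnicodeEncodeError as exc:
    strict_failed = True
    bad_span = (exc.start, exc.end)  # position of the offending character

# 'replace' substitutes a question mark for each unencodable character.
replaced = somestring.encode('ascii', 'replace')

# 'ignore' silently drops the character.
ignored = somestring.encode('ascii', 'ignore')

# 'xmlcharrefreplace' keeps the information as a numeric character reference.
as_xml_ref = somestring.encode('ascii', 'xmlcharrefreplace')
```

The span reported by the exception, (6, 7), matches the 6, 7 in the error message: it is the index range of the en dash in the string.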
