Python STILL won't allow Japanese Characters despite specifying the encoding
#!/usr/bin/env python
# -*- coding: utf8 -*-
print "私"
print u"私"
the result:
ç§
UnicodeEncodeError: 'ascii' codec can't encode character u'\u79c1' in position 0: ordinal not in range(128)
Or, in IDLE for both u"私" and "私":
>>> print "私"
Unsupported characters in input
I've followed all the advice I could find which says that I have to put the "coding" line under the shebang. All my web-browsers display kanji fine, and I can type it fine. But this garble comes out when I try and use it in Python :( Any ideas?
You specified the encoding of the source file and presumably saved the file as UTF-8. Your stdout, however, is still using ASCII, so it is normal for this to fail.
You have an encoding issue, not a decoding issue. Python reads your Unicode characters just fine, and will probably be able to save them to a file if you choose the right encoding.
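To illustrate the point above, here is a small sketch of what probably happened to the plain "私" literal. It assumes the source file really is saved as UTF-8 and that the terminal interprets the bytes as Latin-1; the terminal encoding is a guess on my part, not something the question states. The names text, utf8_bytes and garbled are only for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch: why the byte-string literal prints as garbage while Python
# itself has read the character correctly.

text = u"\u79c1"  # the character 私, written as an escape so the file encoding does not matter

# In a UTF-8 source file, the Python 2 byte-string literal "私" holds these three bytes.
utf8_bytes = text.encode("utf-8")

# ASSUMPTION: the terminal decodes output as Latin-1. Each byte then becomes
# its own character, which is where the "ç§" in the question would come from
# (the third byte maps to an invisible control character).
garbled = utf8_bytes.decode("latin-1")

# Decoding with the right codec recovers the original character.
round_tripped = utf8_bytes.decode("utf-8")
```

So the data inside Python is intact; only the step that turns it into bytes for the terminal, or the way the terminal reads those bytes, goes wrong.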
Still, stdout is not always Unicode compatible, especially on Windows.
You could do something like this: sys.stdout.write(s.encode("utf-8")), where s is your unicode string,
and you will not get an error, but this does not mean that you will see the characters correctly on the screen.
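A slightly fuller sketch of the explicit-encoding idea, written so that it runs on both Python 2 and Python 3. The helper name write_text and its parameters are my own invention for illustration, and UTF-8 is only a default: you would pass whatever encoding your terminal actually uses.

```python
# -*- coding: utf-8 -*-
import sys


def write_text(text, stream=None, encoding="utf-8"):
    """Encode a unicode string explicitly and write the bytes to a stream.

    write_text is a hypothetical helper, not part of the standard library.
    """
    if stream is None:
        stream = sys.stdout
    data = text.encode(encoding) + b"\n"
    # Python 3 text streams expose the underlying byte stream as .buffer;
    # Python 2's sys.stdout (and plain byte streams) accept bytes directly.
    target = getattr(stream, "buffer", stream)
    target.write(data)
    return data
```

Calling write_text(u"私") avoids the UnicodeEncodeError because Python no longer has to guess an encoding, but as said above the terminal still has to interpret those bytes as UTF-8 for the character to show up properly.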
You need a terminal or IDE that supports UTF-8, or at least an encoding that supports Japanese. PythonWin, from the Pywin32 extension library, is an IDE that will work.
Try this:
#!/usr/bin/env python
# -*- coding: utf8 -*-
print unicode("私","UTF-8")
sorin's answer is correct. There's another question which covers the same ground: Setting the correct encoding when piping stdout in Python
Python is applying a default encoding when it writes the output, and this encoding is not UTF-8.
The error from IDLE is because IDLE interprets input according to the system locale. Windows does not provide a locale that accepts UTF-8 input, so the default does not accept arbitrary Unicode. You may change the default with the simple instructions in this answer. You'll still get the incorrect output without reencoding it.
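If you want to check which encodings are in play, something along these lines may help. It is only a diagnostic sketch: can_encode and describe_output_encoding are names I made up, and the values you get depend entirely on your system and on how the script is run.

```python
# -*- coding: utf-8 -*-
import locale
import sys


def can_encode(text, encoding):
    """Return True if the unicode string can be encoded with the named codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def describe_output_encoding():
    """Report the encodings Python sees for stdout and for the system locale."""
    # sys.stdout.encoding can be None, for example when Python 2 output is piped;
    # Python 2 then falls back to ASCII when printing unicode strings.
    stdout_encoding = getattr(sys.stdout, "encoding", None)
    return {"stdout": stdout_encoding, "locale": locale.getpreferredencoding()}
```

For example, can_encode(u"私", "ascii") is False, which is exactly the UnicodeEncodeError from the question, while the same call with "utf-8" succeeds. If describe_output_encoding() reports something that cannot represent Japanese, the output has to be re-encoded or the terminal changed.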