Why is os.path.expanduser misbehaving when the home directory has special characters?
Currently, my user directory is located in "C:\Users\João", and I'm running the 64-bit build of Python 2.7 on Windows 7.
Normally, the Python interpreter runs with 'ascii' as the default encoding. However, for some reason, when Eclipse runs it, the default encoding is 'utf-8'. Now, in the regular Python console, the following happens:
>>> sys.getdefaultencoding()
'ascii'
>>> os.path.expanduser('~/filename')
'C:\\Users\\Jo\xe3o/filename'
>>> x = open(_, 'w')
>>> x.close()
>>>
I'll note that '\xe3' is the code for 'ã' in both Latin-1 and Windows-1252, and that everything goes fine. However, in Eclipse,
>>> sys.getdefaultencoding()
'utf-8'
>>> os.path.expanduser('~/filename')
'C:\\Users\\Jo\xc6o/filename'
>>> x = open(_, 'w')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 13] Permission denied: 'C:\\Users\\Jo\xc6o/filename'
This is confusing, since '\xc6' is the character code for 'Æ' in those encodings and, on top of that, 'Jo\xc6o' isn't valid UTF-8.
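One possible reading of those bytes, offered as a guess rather than a confirmed diagnosis: '\xc6' happens to be 'ã' in code page 850, an OEM (DOS) code page commonly used on Western European Windows installations. The sketch below decodes both observed bytes under a few single-byte encodings and checks the UTF-8 claim. The helper names `decode_in` and `is_valid_utf8` and the list of candidate encodings are assumptions made for illustration, not something from the thread.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Candidate single-byte encodings (an assumption, not taken from the thread).
CANDIDATES = ('latin-1', 'cp1252', 'cp850')


def decode_in(raw, codecs=CANDIDATES):
    """Decode the byte string `raw` under each codec; return {codec: text}."""
    return dict((name, raw.decode(name)) for name in codecs)


def is_valid_utf8(raw):
    """Return True if `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode('utf-8')
        return True
    except UnicodeDecodeError:
        return False


if __name__ == '__main__':
    # b'\xe3' is what the plain console reported, b'\xc6' is what Eclipse reported.
    for raw in (b'\xe3', b'\xc6'):
        for name, text in sorted(decode_in(raw).items()):
            print(repr(raw), name, repr(text))
    print('valid UTF-8:', is_valid_utf8(b'Jo\xc6o'))
```

If the cp850 row for b'\xc6' comes out as u'\xe3' (that is, 'ã'), the path seen in Eclipse may be the same name encoded in the OEM code page instead of the ANSI one, which would point at the environment Eclipse hands to Python and not at `os.path.expanduser` itself.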
If you're wondering about "Permission denied", instead of "No such file or directory", a couple of programs have also written stuff to 'C:\Users\JoÆo', and I also have no idea why.
So what is the cause of this, and what is the solution? Is it even programmatic, or do you think some system setting might be wrong?
TL;DR: Home directory is correctly reported as 'C:\Users\João' in the standard Python interpreter, and as 'C:\Users\JoÆo' when the interpreter is running in Eclipse. Why?
Try changing Eclipse's default encoding. Menu bar: Window -> Preferences; tree: General -> Workspace; change "Text file encoding" from Cp1252 to ISO-8859-1.
You can also change it for a specific debug configuration: open "Debug Configurations", go to the "Common" tab, and change "Encoding".
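To see whether a change like this actually reaches the interpreter, it may help to run the same small report in the plain console and inside Eclipse and compare the two. This is only a sketch: the function name `encoding_report` is made up for illustration, and `USERPROFILE` is assumed to be the variable the home directory comes from on this setup (it will simply show as None elsewhere).

```python
from __future__ import print_function
import locale
import os
import sys


def encoding_report():
    """Collect the encoding-related settings visible to this interpreter."""
    return {
        'default encoding': sys.getdefaultencoding(),
        'filesystem encoding': sys.getfilesystemencoding(),
        'stdout encoding': getattr(sys.stdout, 'encoding', None),
        'preferred encoding': locale.getpreferredencoding(),
        # repr() shows the raw value without the console re-encoding it.
        'USERPROFILE': repr(os.environ.get('USERPROFILE')),
        'expanduser': repr(os.path.expanduser('~')),
    }


if __name__ == '__main__':
    for key, value in sorted(encoding_report().items()):
        print('%-20s %s' % (key, value))
```

Any row that differs between the two runs is a candidate for where the 'ã' byte gets changed.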
Edit: Very weird. Performing a glob on "./J*" in a directory with a "João" directory works fine for me in Eclipse with UTF-8 and Cp1252 (the default) in 64-bit Python 2.7.2, Windows 7.
UTF-8:
['.\Jo\xe3o']
Cp1252:
['.\Jo\xe3o']
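For anyone who wants to repeat that check, here is a minimal sketch of the glob test. The helper name `find_entries` is hypothetical, and using a unicode pattern is my own choice, not necessarily what was run above: in Python 2 a unicode pattern should make `glob` return unicode names instead of byte strings in the current code page.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import glob
import os


def find_entries(base_dir, pattern=u'J*'):
    """Return the sorted entries of `base_dir` that match `pattern`."""
    return sorted(glob.glob(os.path.join(base_dir, pattern)))


if __name__ == '__main__':
    # u'.' keeps the whole pattern unicode, so Python 2 returns unicode names.
    for path in find_entries(u'.'):
        print(repr(path))
```

Run it from a directory that contains a "João" folder, once in the plain console and once in Eclipse, and compare the printed representations.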