Problem with cx_freeze and utf-8 characters not showing
I'm trying to compile a Python script which contains Spanish strings.
If I run the .py, the text is displayed correctly. Compilation runs fine, but when I run the resulting .exe, the non-ASCII characters are replaced with error characters, and no error is reported.
I couldn't find anyone asking about the same problem. Am I the only one trying to compile an ñ, or am I missing something in my compilation?
I'm using Python 3.1.2 with cx_freeze 4.2.1 on Windows XP. The problem occurs consistently with both basic compilation (\Scripts\cxfreeze) and advanced compilation (setup.py).
test code, main.py
# coding=UTF-8
print('mensaje de prueba \u00e1ñ ó \xf1')
running .py
running .exe
EDIT:
frozen Machin test source
It is not possible to be certain, but assuming that what appears to be in your source file and what appears to be displayed has not been transmogrified in transmission, your problem is this:
You expect to see (a-acute, n-tilde, o-acute), but you actually see "error characters" (no-break space aka NBSP, currency sign, cent sign).
I don't have cxfreeze. My guess is that cxfreeze is doubly encoding your output. This is based on running the following source file using Python 3.2.0 on Windows 7. You will notice that I have used escape sequences for the text characters in order to rule out any noise caused by source encoding problems.
# coding: ascii ... what you see is what you've got.
# expected output: a-acute(e1) n-tilde(f1) o-acute(f3)
import sys
import unicodedata as ucd
text = '\xe1\xf1\xf3'
print("expected output:")
for c in text:
    print(ascii(c), ucd.name(c))
print("seen output[%s]" % text)
sse = sys.stdout.encoding
print(sse)
print("Expected raw bytes output:", text.encode(sse))
whoops = text.encode(sse).decode('latin1')
print("whoops:")
for w in whoops:
    print(ascii(w), ucd.name(w))
and here is its output.
expected output:
'\xe1' LATIN SMALL LETTER A WITH ACUTE
'\xf1' LATIN SMALL LETTER N WITH TILDE
'\xf3' LATIN SMALL LETTER O WITH ACUTE
seen output[áñó]
cp850
Expected raw bytes output: b'\xa0\xa4\xa2'
whoops:
'\xa0' NO-BREAK SPACE
'\xa4' CURRENCY SIGN
'\xa2' CENT SIGN
In the brackets after "seen output", I see a-acute, n-tilde, and o-acute as expected. Please run the script with and without cxfreezing, and report (in words) what you see. If the frozen "seen output" is in fact a space followed by a currency sign and a cent sign, you should report the problem (with a link to this answer) to the cxfreeze maintainer.
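To compare what you see against the double-encoding guess without depending on the console at hand, here is a minimal sketch that hard-codes the encodings. The helper names (simulate_double_encoding, looks_double_encoded) are made up for illustration, and the choice of cp850 followed by a latin-1 misreading is only the hypothesis described above, not confirmed cxfreeze behaviour.

```python
# Hypothetical helpers: they only model the guess that text is encoded for
# the console (cp850) and the resulting bytes are then misread as latin-1.

def simulate_double_encoding(text, console_encoding="cp850", misread_as="latin-1"):
    """Encode text for the console, then decode the bytes with another codec."""
    raw = text.encode(console_encoding)
    return raw.decode(misread_as)


def looks_double_encoded(expected, seen, console_encoding="cp850", misread_as="latin-1"):
    """Return True if 'seen' is what the guess predicts for 'expected'."""
    return seen == simulate_double_encoding(expected, console_encoding, misread_as)


expected = "\xe1\xf1\xf3"  # a-acute, n-tilde, o-acute
garbled = simulate_double_encoding(expected)

# ascii() keeps the output readable whatever the console encoding is.
print(ascii(garbled))
print(looks_double_encoded(expected, garbled))
```

If the characters shown by the frozen .exe match garbled, that supports the guess. If they do not, trying other values for misread_as (for example 'cp1252') may narrow down what is actually happening.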