Python / Mako : How to get unicode strings/characters parsed correctly?

2023-01-17 04:42

I'm trying to get Mako to render a string containing unicode characters:

tempLook=TemplateLookup(..., default_filters=[], input_encoding='utf8',output_encoding='utf-8', encoding_errors='replace')
...
print sys.stdout.encoding
uname=cherrypy.session['userName']
print uname
kwargs['_toshow']=uname
...
return tempLook.get_template(page).render(**kwargs)

The related template file :

...${_toshow}...

And the output is :

UTF-8
Deşghfkskhü
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 1: ordinal not in range(128)

I don't think there's any problem with the string itself since I can print it just fine.

Although I've played (a lot) with the input_encoding, output_encoding and default_filters parameters, it always complains about being unable to decode/encode with the ascii codec.

So I decided to try out the example found in the documentation, and the following works the "best":

input_encoding='utf-8', output_encoding='utf-8'
#(note : it still raised an error without output_encoding, despite tutorial not implying it)

With

${u"voix m’a réveillé."}

And the result being

voix mâ�a rÃ©veillÃ©

I simply don't get why this doesn't work. "Magic encoding comment"s don't work either. All the files are encoded with UTF-8.

I've spent hours to no avail, am I missing something ?

~~Update :~~

~~I have a simpler question now :~~

~~Now that all the variables are unicode, how can I get Mako to render unicode strings without applying anything ? Passing a blank filter / render_unicode() doesn't help.~~

Yes, UTF-8 != Unicode.

UTF-8 is a specific string encoding, as are ASCII and ISO 8859-1. Try this:

For any input string, call inputstring.decode('utf-8') (or whatever encoding the input actually uses). For any output string, call outputstring.encode('utf-8') (or whatever output encoding you want). For any internal use, work with unicode strings ('this is a normal string'.decode('utf-8') == u'this is a normal string').
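As a rough sketch of that rule, assuming the user name arrives as UTF-8 encoded bytes: the helper names to_text and to_bytes are made up for illustration and are not part of Mako or CherryPy.

```python
# Illustrative helpers (hypothetical names): decode at the input boundary,
# keep unicode internally, encode only when producing output.

def to_text(value, encoding='utf-8'):
    """Return unicode text; decode only if the value is still encoded bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value

def to_bytes(text, encoding='utf-8'):
    """Encode unicode text for output."""
    return text.encode(encoding)

# Assumed input: the user name from the question as UTF-8 bytes.
raw_name = b'De\xc5\x9fghfkskh\xc3\xbc'

name = to_text(raw_name)      # unicode from here on, safe to pass to a template
encoded = to_bytes(name)      # bytes again, only at the output boundary
```

With this approach the value placed in kwargs['_toshow'] would be name (unicode), not the raw byte string.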

'foo' is a byte string, u'foo' is a unicode string, which doesn't "have" an encoding (it can't be decoded). So any time Python needs to change the encoding of a normal string, it first tries to "decode" it, then to "encode" it. The default codec for that step is "ascii", which fails more often than not :-)
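The failing step can be reproduced by doing the ASCII decode explicitly. This is only a sketch: ascii_decode_error is a hypothetical helper, and the last part rests on the assumption that the garbled output in the question comes from UTF-8 bytes being read back as Latin-1, which the question does not confirm.

```python
raw = b'De\xc5\x9fghfkskh\xc3\xbc'      # UTF-8 bytes, not yet decoded

def ascii_decode_error(data):
    """Return the error text of an ASCII decode, or None if it succeeds."""
    try:
        data.decode('ascii')            # what the implicit conversion attempts
    except UnicodeDecodeError as exc:
        return str(exc)
    return None

message = ascii_decode_error(raw)       # complains about byte 0xc5
plain = ascii_decode_error(b'foo')      # pure ASCII bytes decode without error

# Assumption: the garbled template output is UTF-8 read back as Latin-1.
original = u'voix m\u2019a r\xe9veill\xe9.'
garbled = original.encode('utf-8').decode('latin-1')
repaired = garbled.encode('latin-1').decode('utf-8')
```

If that assumption holds, the template itself rendered correctly and the bytes were only misread afterwards, for example by a response that does not declare UTF-8 as its charset.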
