How to get unicode month name in Python?
I am trying to get a unicode version of calendar.month_abbr[6]. If I don't specify an encoding for the locale, I don't know how to convert the string to unicode. The example code below shows my problem:
>>> import locale
>>> import calendar
>>> locale.setlocale(locale.LC_ALL, ("ru_RU"))
'ru_RU'
>>> print repr(calendar.month_abbr[6])
'\xb8\xee\xdd'
>>> print repr(calendar.month_abbr[6].decode("utf8"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb8 in position 0: unexpected code byte
>>> locale.setlocale(locale.LC_ALL, ("ru_RU", "utf8"))
'ru_RU.UTF8'
>>> print repr(calendar.month_abbr[6])
'\xd0\x98\xd1\x8e\xd0\xbd'
>>> print repr(calendar.month_abbr[6].decode("utf8"))
u'\u0418\u044e\u043d'
Any ideas how to solve this? The solution doesn't have to look like this. Any solution that gives me the abbreviated month name in unicode is fine.
Change the last line in your code:
>>> print calendar.month_abbr[6].decode("utf8")
Июн
The repr() call hides the fact that you already have what you need: it prints the escaped form of the unicode string instead of the characters themselves.
Also, getlocale() can be used to get the encoding of the current locale:
>>> locale.setlocale(locale.LC_ALL, 'en_US')
'en_US'
>>> locale.getlocale()
('en_US', 'ISO8859-1')
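As a rough sketch of how that encoding could be put to work for the original problem, the helper below decodes a byte string with the encoding reported by getlocale(). The name decode_locale_bytes is made up for illustration and is not part of any library. The fallback to getpreferredencoding(False) is an assumption for locales where getlocale() reports no encoding.

```python
import locale


def decode_locale_bytes(raw, encoding=None):
    """Return raw as text, decoding with the given or the current locale encoding.

    decode_locale_bytes is a hypothetical helper name used for illustration.
    """
    if not isinstance(raw, bytes):
        # Already text (calendar names are text on Python 3), nothing to decode.
        return raw
    if encoding is None:
        # getlocale() returns (language, encoding); the encoding may be None,
        # for example in the "C" locale, so fall back to the preferred encoding
        # of the locale currently in effect (an assumption of this sketch).
        encoding = locale.getlocale()[1] or locale.getpreferredencoding(False)
    return raw.decode(encoding)
```

The bytes from the question's first attempt, '\xb8\xee\xdd', are consistent with ISO 8859-5, so decode_locale_bytes(b'\xb8\xee\xdd', 'ISO8859-5') gives u'\u0418\u044e\u043d', the same value the UTF-8 variant produced. After setlocale(), calling decode_locale_bytes(calendar.month_abbr[6]) with no explicit encoding should pick up whatever encoding the locale reports.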
Other modules that might be useful for you:
- PyICU - a better way to handle internationalization. While locale produces either the initial or the inflected form of the month name depending on the locale database in your OS (so you can't rely on it for languages such as Russian!) and uses some encoding, PyICU has different format specifiers for the initial and inflected forms (so you can select the appropriate one in your case) and uses unicode.
- pytils - a set of tools for working with the Russian language, including dates. It has hard-coded month names as a workaround for locale limitations.
What you need is:
…
myencoding = locale.getpreferredencoding()
print repr(calendar.month_abbr[6].decode(myencoding))
…
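A slightly fuller sketch of the same idea, in case it helps: it switches the locale temporarily, reads the month name, decodes it if it is a byte string, and restores the previous locale. The names temporary_locale and month_abbr_text are invented for this example. Passing False to getpreferredencoding() is a choice made here so that it reports the encoding of the locale currently in effect instead of re-reading the user's environment; treat that as an assumption to verify on your platform.

```python
import calendar
import locale
from contextlib import contextmanager


@contextmanager
def temporary_locale(name):
    """Switch LC_ALL to name inside the block, then restore the old setting."""
    saved = locale.setlocale(locale.LC_ALL)  # query only, changes nothing
    try:
        locale.setlocale(locale.LC_ALL, name)
        yield
    finally:
        locale.setlocale(locale.LC_ALL, saved)


def month_abbr_text(month, locale_name):
    """Return the abbreviated month name for locale_name as text."""
    with temporary_locale(locale_name):
        name = calendar.month_abbr[month]
        if isinstance(name, bytes):
            # Python 2: a byte string in the locale's encoding.
            # do_setlocale=False asks about the locale in effect right now.
            name = name.decode(locale.getpreferredencoding(False))
        return name
```

With this, month_abbr_text(6, "ru_RU") or month_abbr_text(6, ("ru_RU", "utf8")) should return the unicode month name whichever encoding the locale uses, provided that locale is installed on the system.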