How to get unicode month name in Python?

I am trying to get a unicode version of calendar.month_abbr[6]. If I don't specify an encoding for the locale, I don't know which encoding to use to convert the string to unicode. The example code below shows my problem:

>>> import locale
>>> import calendar
>>> locale.setlocale(locale.LC_ALL, ("ru_RU"))
'ru_RU'
>>> print repr(calendar.month_abbr[6])
'\xb8\xee\xdd'
>>> print repr(calendar.month_abbr[6].decode("utf8"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb8 in position 0: unexpected code byte
>>> locale.setlocale(locale.LC_ALL, ("ru_RU", "utf8"))
'ru_RU.UTF8'
>>> print repr(calendar.month_abbr[6])
'\xd0\x98\xd1\x8e\xd0\xbd'
>>> print repr(calendar.month_abbr[6].decode("utf8"))
u'\u0418\u044e\u043d'

Any ideas how to solve this? The solution doesn't have to look like this. Any solution that gives me the abbreviated month name in unicode is fine.


Change the last line in your code:

>>> print calendar.month_abbr[6].decode("utf8")
Июн

The repr() call is what hides the result from you: it prints the escaped form of the unicode object, so you don't see that you already have what you need.

Also, getlocale() can be used to get the encoding of the current locale:

>>> locale.setlocale(locale.LC_ALL, 'en_US')
'en_US'
>>> locale.getlocale()
('en_US', 'ISO8859-1')
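
A minimal sketch of how that encoding lookup could be wrapped, assuming Python 2.6+ or Python 3. The helper name locale_encoding and its "ascii" default are hypothetical, not part of the locale module. The encoding part of getlocale() can be None (for example in the default "C" locale), so the sketch falls back to locale.getpreferredencoding():

```python
import locale


def locale_encoding(default="ascii"):
    """Return an encoding name for the current locale (hypothetical helper)."""
    try:
        # getlocale() returns a (language, encoding) tuple; either may be None.
        encoding = locale.getlocale()[1]
    except ValueError:
        # Some platforms report locale names that getlocale() cannot parse.
        encoding = None
    if not encoding:
        # Fall back to the user's preferred encoding without changing the locale.
        encoding = locale.getpreferredencoding(False)
    return encoding or default


print(locale_encoding())
```

The printed name depends on the machine's locale settings, so no particular output should be expected.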

Other modules that might be useful for you:

  • PyICU - a better way to handle internationalization. The locale module returns either the initial or the inflected form of a month name, depending on the locale database in your OS (so you can't rely on it for languages like Russian), and it returns the name in some locale-dependent encoding. PyICU has separate format specifiers for the initial and inflected forms (so you can pick the appropriate one for your case) and works in unicode.
  • pytils - a set of tools for working with the Russian language, including dates. It hard-codes month names as a workaround for the locale limitations.


What you need is:

…
myencoding = locale.getpreferredencoding()
print repr(calendar.month_abbr[6].decode(myencoding))
…
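
The same idea as a self-contained sketch, assuming Python 2.6+ or Python 3. The helper names to_unicode and month_abbr_unicode are made up for illustration. On Python 3 calendar.month_abbr already returns unicode text, so the helper only decodes when it is handed a byte string:

```python
# -*- coding: utf-8 -*-
import calendar
import locale


def to_unicode(value, encoding=None):
    """Decode a byte string with the given (or the preferred) encoding."""
    if isinstance(value, bytes):
        # On Python 2, str is bytes, so month names end up in this branch.
        return value.decode(encoding or locale.getpreferredencoding())
    # Already unicode text (the normal case on Python 3).
    return value


def month_abbr_unicode(month, encoding=None):
    """Return the abbreviated month name for the current locale as unicode."""
    return to_unicode(calendar.month_abbr[month], encoding)


# The two byte strings from the question decode to the same text
# once the matching encoding is used.
utf8_name = to_unicode(b"\xd0\x98\xd1\x8e\xd0\xbd", "utf-8")
iso_name = to_unicode(b"\xb8\xee\xdd", "iso8859-5")
print(utf8_name == iso_name)  # prints True
```

This also suggests why the first attempt in the question failed: '\xb8\xee\xdd' is "Июн" in ISO 8859-5, not UTF-8, so it has to be decoded with the locale's own encoding.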