Transliteration from Ethiopic (and others) to ASCII (ሀ -> ha; ü -> ue)

I am not yet very good at reading Amharic (Ge'ez / Ethiopic) letters.

If I have a text in Ge'ez (Ethiopic) script ( http://en.wikipedia.org/wiki/Ge%27ez_language ), I want to transliterate it to ASCII.

When I open http://www.addismap.com/am/ (a webpage in Amharic) in the Lynx text-mode browser, it shows me "edis map: yeedis ebeba karta". How can I access this functionality from, for example, Python, Bash or PHP? Which API does Lynx use?

It does not seem to be iconv:

$ iconv -f UTF-8 -t ASCII//TRANSLIT
Input:    ሀ ለ ሐ መ ሠ ረ ሰ
Output:   ? ? ? ? ? ? ?


ICU ( http://icu-project.org/ ) has an Amharic-Latin transform, which will turn your text into "hā le ḥā me še re se". You can run it from the command line with uconv -x 'Amharic/BGN-Latin', or call it from Python through pyicu.


The Unicode Common Locale Data Repository defines some transliterations. Unidecode (or its Python port) has even more of them.
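
If neither ICU nor Unidecode is available, a rough fallback is possible with only the Python standard library, because the Unicode character names of Ethiopic syllables already contain a Latin spelling (for example, ሀ is named ETHIOPIC SYLLABLE HA). The sketch below is an approximation, not a standard romanization and not what Lynx or ICU do internally; the function name rough_ascii is made up for this example. Other accented letters are reduced by stripping combining marks, so ü becomes "u", not "ue".

```python
import unicodedata

# Prefix shared by the Unicode names of Ethiopic syllables,
# e.g. "ETHIOPIC SYLLABLE HA" for U+1200.
ETHIOPIC_PREFIX = "ETHIOPIC SYLLABLE "


def rough_ascii(text):
    """Approximate ASCII rendering of text (hypothetical helper).

    Assumption: the last word of an Ethiopic syllable's Unicode name is
    an acceptable Latin spelling. This is a convenience, not a standard
    romanization scheme.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Already ASCII: keep unchanged.
            out.append(ch)
            continue
        name = unicodedata.name(ch, "")
        if name.startswith(ETHIOPIC_PREFIX):
            # "ETHIOPIC SYLLABLE HHA" -> "hha"
            out.append(name.split()[-1].lower())
            continue
        # Other characters: decompose and drop anything non-ASCII,
        # which removes combining accents (ü -> u).
        stripped = (
            unicodedata.normalize("NFKD", ch)
            .encode("ascii", "ignore")
            .decode("ascii")
        )
        out.append(stripped or "?")
    return "".join(out)


print(rough_ascii("ሀ ለ ሐ መ ሠ ረ ሰ"))
```

Because each syllable maps to its own chunk of letters, this only reads well when the input characters are separated, as in the test string from the question; for running text, the ICU transform above will give more usable output.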