Transliteration from Ethiopic (and others) to ASCII (ሀ -> ha; ü -> ue)
I am not yet very good at reading Amharic (Ge'ez / Ethiopic) letters.
If I have a text in Ge'ez (Ethiopic) script ( http://en.wikipedia.org/wiki/Ge%27ez_language ), I want to transliterate it to ASCII.
When I open http://www.addismap.com/am/ (a webpage in Amharic) with the Lynx text-mode browser, it shows me "edis map: yeedis ebeba karta". How can I access this functionality from, for example, Python, Bash or PHP? Which API does Lynx use?
It does not seem to be iconv:
$ iconv -f UTF-8 -t ASCII//TRANSLIT
Input: ሀ ለ ሐ መ ሠ ረ ሰ
Output: ? ? ? ? ? ? ?
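ICU ( http://icu-project.org/ ) has an Amharic-Latin transform, which will turn your text into "hā le ḥā me še re se". Note that this output still contains diacritics, so it is Latin script rather than pure ASCII. You can apply the transform from the command line with uconv -x 'Amharic/BGN-Latin', or from Python with PyICU.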
The Unicode Common Locale Data Repository defines some transliterations. Unidecode (or its Python port) has even more of them.
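If installing ICU or Unidecode is not an option, a very rough fallback can be sketched with nothing but Python's standard library. This is an assumption-laden illustration, not what Lynx, ICU or Unidecode actually do. It reads the romanized syllable out of each character's official Unicode name (for example "ETHIOPIC SYLLABLE HA"), so it follows the Unicode naming convention rather than the BGN scheme: ለ comes out as "la", where ICU gives "le". The function name rough_ascii is made up for this example.

```python
import unicodedata

# Hypothetical helper, not part of any library: derive a crude ASCII
# rendering from the Unicode character names of Ethiopic syllables.
ETHIOPIC_PREFIX = "ETHIOPIC SYLLABLE "


def rough_ascii(text: str) -> str:
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Already ASCII: keep unchanged.
            out.append(ch)
            continue
        name = unicodedata.name(ch, "")
        if name.startswith(ETHIOPIC_PREFIX):
            # "ETHIOPIC SYLLABLE HA" -> "ha"; for multi-word names such as
            # "ETHIOPIC SYLLABLE GLOTTAL A" only the last word is kept.
            out.append(name.rsplit(" ", 1)[-1].lower())
            continue
        # Other characters: decompose and drop anything that is not ASCII,
        # so accented Latin letters lose their accents (ü -> u, not ue).
        stripped = (
            unicodedata.normalize("NFKD", ch)
            .encode("ascii", "ignore")
            .decode("ascii")
        )
        out.append(stripped or "?")
    return "".join(out)


if __name__ == "__main__":
    print(rough_ascii("\u1200 \u1208 \u1218"))  # the syllables ሀ ለ መ
```

The result is only as good as the Unicode names, and German-style expansions such as ü -> ue would need an explicit mapping table on top of this.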