How to detect locale/language if the locale doesn't have a codepage?

I need to detect the language of a Unicode WideString. I have tried the IMultiLanguage2 interface, and it works properly when the locale has a codepage. Some locales/languages do not have codepages and are mapped to Unicode only. How can I get the LCID for those? Georgian, Hindi, and many other languages do not have codepages and are Unicode-only collations.

I am using Delphi 7 Enterprise.

Would really appreciate any help

Regards


The question is based on a misunderstanding of Unicode. Unicode is a way of representing writing systems, not languages. Imagine a Unicode string consisting of the three code points U+0073, U+0069, and U+006E, that is, "sin". Is it English? Is it the Spanish word for "without"? Is it the reflexive possessive ("his/her/its own") in any of several Scandinavian languages? Who knows.

You mention Georgian and Hindi. Georgian script (ქართული დამწერლობა) can be used to represent Georgian, of course, but also Mingrelian, Svan, and some other even rarer languages. There is no "Hindi" script, any more than there are "English" letters. Just as English is written in Latin letters that we inherited from our Latin-speaking forebears, Hindi is written in Devanāgarī (देवनागरी), a beautiful script that is also used for ancient Sanskrit and modern Marathi and Nepali and dozens of other languages. And don't get me started on Chinese.

If you are pressed and have to accept a hackish near-solution, you can make approximations: "since this character is from the Devanāgarī range (U+0900–U+097F) or the Georgian ranges (U+10A0–U+10FC and U+2D00–U+2D25), I'll assume it is probably Hindi or probably Georgian." Such a method would be error-prone and vague, but you could start from a table of Unicode block ranges.
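As a rough illustration of that range-based approximation, here is a minimal sketch in Python (the idea ports directly to a loop over `Ord(S[i])` on a Delphi WideString). The names `SCRIPT_RANGES`, `script_of`, and `guess_script` are made up for this example, and the basic Latin letter ranges are an addition of mine for comparison. Note that it guesses a script, not a language, so it inherits all the ambiguity described above.

```python
from collections import Counter

# Hypothetical range table: the Devanagari and Georgian ranges are the ones
# quoted above; the basic Latin letter ranges are added for comparison.
SCRIPT_RANGES = [
    ("Devanagari", 0x0900, 0x097F),
    ("Georgian", 0x10A0, 0x10FC),
    ("Georgian", 0x2D00, 0x2D25),
    ("Latin", 0x0041, 0x005A),  # A-Z
    ("Latin", 0x0061, 0x007A),  # a-z
]


def script_of(ch):
    """Return the script name for one character, or None if it is not in the table."""
    cp = ord(ch)
    for name, lo, hi in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return None


def guess_script(text):
    """Return the script that covers the most characters of text, or None.

    Characters outside the table (digits, spaces, punctuation, other scripts)
    are ignored. This is only a heuristic: it says nothing reliable about
    which language the text is written in.
    """
    counts = Counter(s for s in map(script_of, text) if s is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Mapping the guessed script to an LCID is then a separate, equally approximate step: a Devanāgarī result could just as well be Marathi or Nepali as Hindi.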


I usually do not give this kind of answer, but anyway: you don't! This is the kind of task you cannot really solve. There are too many cases where you cannot determine the language.

BTW, the only place where I have seen a feature like this is Google Translate, and it works only if the text is quite long, and even then there is no guarantee.
