
Comparing letters from different alphabets

Some letters in different alphabets look exactly the same.

For example, A in Latin and А in Cyrillic.

Do they play the same role when I use one of them in a UTF-8 script?


If not, how can I find out the code of a given letter?


It's not clear what you mean by "play the same role".

They are certainly not the same character, though they may appear to be when rendered.

This is exactly analogous to the confusion between "l" (lowercase L) and "I" (uppercase i) in many fonts.
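To answer the "how do I get the code of a given letter" part, here is a minimal sketch. It is written in Python purely for illustration (the question is about PHP), and the helper name `describe` is made up for this example. It prints the code point, the Unicode name and the UTF-8 bytes of each letter:

```python
import unicodedata

def describe(ch: str) -> dict:
    """Return the code point, Unicode name and UTF-8 bytes (hex) of one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "utf8": ch.encode("utf-8").hex(),
    }

latin_a = "\u0041"     # A, Latin
cyrillic_a = "\u0410"  # А, Cyrillic

print(describe(latin_a))
print(describe(cyrillic_a))
print(latin_a == cyrillic_a)  # the two strings do not compare equal
```

The Latin letter is a single byte in UTF-8 while the Cyrillic one takes two bytes, so any byte-wise or code-point-wise comparison treats them as different characters.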

If you want to consider A and А to be the same, you have to transliterate the Cyrillic letter into its Latin counterpart. Unfortunately, PHP support for transliteration is sketchy. You can use iconv, which is not great: if you transliterate to ASCII, you'll lose everything that cannot be represented in ASCII.
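As a rough illustration of the idea (again in Python, not PHP, and not what iconv does internally), the sketch below maps a few Cyrillic capitals to Latin ones with a hand-written table. The table `CYRILLIC_TO_LATIN` and the function `transliterate` are hypothetical and deliberately tiny; a real transliterator needs a complete, language-aware mapping. The last function shows the kind of loss you get when you simply force text into ASCII:

```python
# Hypothetical, deliberately incomplete mapping used only for this example.
CYRILLIC_TO_LATIN = str.maketrans({
    "\u0410": "A",  # А
    "\u0411": "B",  # Б
    "\u0412": "V",  # В
    "\u0413": "G",  # Г
    "\u0414": "D",  # Д
})

def transliterate(text: str) -> str:
    """Replace the Cyrillic letters listed in the table; leave everything else alone."""
    return text.translate(CYRILLIC_TO_LATIN)

def force_ascii(text: str) -> str:
    """Drop every character that has no ASCII representation (lossy)."""
    return text.encode("ascii", "ignore").decode("ascii")

mixed = "\u0410\u0411C"      # Cyrillic А, Cyrillic Б, Latin C
print(transliterate(mixed))  # mapped letters become Latin
print(force_ascii(mixed))    # unmapped letters are silently dropped
```

After transliteration, the Cyrillic А compares equal to the Latin A, which is the "consider them the same" behaviour described above.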

The Unicode PHP implementation (what was supposed to be PHP 6) had a function called str_transliterate that used the ICU transliteration API. Hopefully, transliteration will be added to the intl extension (the current ICU wrapper) in the future.


You might be interested in the 'spoof detection' API in ICU. I think it is designed to report that your two As are 'visually confusable'.
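For a feel of what such a check involves, here is a much cruder stand-in written in Python. It is not the ICU spoof checker and does not use ICU's confusables data. It only reports whether a string mixes letters from more than one script, guessing the script from the first word of each character's Unicode name (an assumption that works for Latin, Cyrillic and Greek letters, but is not a general script lookup). The function names are made up for this example:

```python
import unicodedata

def scripts_used(text: str) -> set:
    """Collect a rough script label for every alphabetic character in text."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # Assumption: the first word of the Unicode name identifies the script,
            # e.g. "LATIN CAPITAL LETTER A" -> "LATIN".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts

def looks_mixed(text: str) -> bool:
    """True if letters from more than one script appear in the same string."""
    return len(scripts_used(text)) > 1

plain = "PAYPAL"         # all Latin
spoofed = "P\u0410YPAL"  # second letter is Cyrillic А

print(scripts_used(plain), looks_mixed(plain))
print(scripts_used(spoofed), looks_mixed(spoofed))
```

A mixed-script result is only a hint: legitimate text can mix scripts, and a string written entirely in lookalike Cyrillic letters would pass this check, which is why a real confusables table is needed for serious use.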
