Alphabetize Arabic and Japanese text that is in Unicode?
Does anyone have any code for alphabetizing Arabic and Japanese text that is in Unicode? If the code were in Ruby, that would be great.
Unicode code points are not listed in alphabetic order (Z < a, for example), but they try to be approximately in that order anyway. There is a default Unicode ordering, defined by the Unicode Collation Algorithm, and there are also language-specific orderings (French order is not exactly the same as German or Czech order, even with the same alphabet), which can be specified in locale information. I think the ICU library contains the language-specific algorithms you are looking for.
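To make the difference between code point order and alphabetical order concrete, here is a small Ruby sketch. The word list is made up for illustration, and the `downcase` trick is only a crude workaround, not the Unicode Collation Algorithm; it assumes Ruby 2.4 or later with UTF-8 source files.

```ruby
# Made-up sample words; "\u00E9cole" is "école".
words = ["apple", "Zebra", "\u00E9cole", "zoo"]

# Plain sort compares code points: "Z" (U+005A) comes before "a" (U+0061),
# and "é" (U+00E9) comes after "z" (U+007A).
by_code_point = words.sort

# Case-folding the sort key fixes the upper/lower case problem,
# but "école" still lands after "zoo". Fixing that needs real collation.
case_folded = words.sort_by { |w| w.downcase }

puts by_code_point.inspect
puts case_folded.inspect
```

Getting "école" to sort near "e" is the part that needs a collation library such as ICU.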
I don't know Ruby, but Python has a function, ord(), that translates a Unicode character to its code point. For example,
>>> a = u'ل'
>>> ord(a)
1604
>>> b = u'ع'
>>> ord(b)
1593
Look for something like that in Ruby. I assume that the Arabic letters are listed in Unicode in alphabetical order.
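For what it's worth, the Ruby counterpart appears to be String#ord (available since Ruby 1.9), with String#codepoints for whole strings. A minimal sketch, using escapes for the same two letters as above; the variable names are just for illustration, and `codepoints` returning an array assumes Ruby 2.0 or later.

```ruby
lam = "\u0644"  # Arabic letter lam (ل)
ain = "\u0639"  # Arabic letter ain (ع)

# String#ord returns the code point of the first character.
lam_cp = lam.ord
ain_cp = ain.ord

# String#codepoints returns the code points of every character.
word_cps = (ain + lam).codepoints

puts lam_cp   # 1604
puts ain_cp   # 1593
puts word_cps.inspect
```

Ain comes before lam in the Arabic alphabet, and 1593 < 1604, so at least these two letters agree with the assumption.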
To ask the obvious question, what don't you like about mylist.sort?
Depending on your needs, words.sort in Ruby will be fine for Japanese. The order in which the characters appear in Unicode is a reasonably good sorting order. I can't vouch for Arabic, but my guess is that it's OK as well.
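A quick sketch of what that looks like in practice, assuming Ruby 1.9 or later and UTF-8 source. The words are arbitrary examples. Note the limit of "reasonably good": the katakana block comes after the hiragana block, so a katakana word sorts after every hiragana word regardless of how it is read.

```ruby
# Three hiragana words: sakura, ame, kaze.
hiragana = ["さくら", "あめ", "かぜ"]
sorted_hiragana = hiragana.sort

# Same list plus katakana "アメ", which reads the same as "あめ".
mixed = ["アメ", "さくら", "あめ"]
sorted_mixed = mixed.sort

# Three Arabic letters: lam, ain, beh.
arabic = ["\u0644", "\u0639", "\u0628"]
sorted_arabic = arabic.sort

puts sorted_hiragana.inspect  # あめ, かぜ, さくら (gojuon order)
puts sorted_mixed.inspect     # あめ, さくら, アメ (katakana last)
puts sorted_arabic.inspect    # beh, ain, lam
```

Kanji are sorted by code point too, not by reading, so text that mixes kanji and kana would need a reading-aware key or a collation library.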
mylist.sort should work out of the box in Ruby 1.9 (which has built-in Unicode support). In Ruby 1.8, where Unicode support isn't built in, I think you'd have to use the character-encodings gem to extend the String class with UTF-8 string comparisons. (And then mylist.sort would work.)
 