Java Unicode strings sorting
In Java, how do Unicode strings get compared?
What I mean is, if I have a few, say, Japanese strings and I do the following:
java.util.Arrays.sort(arrayOfJapaneseStrings);
how do those strings get compared and sorted?
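By default, Strings sort lexicographically, by the numeric value of their UTF-16 code units. For characters outside the BMP (which are stored as surrogate pairs) this can differ from code point order, so it might not be exactly what you want for certain characters, but the Japanese characters in everyday use are all in the BMP, so you shouldn't have a problem with these.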
If you would like a different sort order, you can use the java.text.Collator class to define one.
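To illustrate the default ordering, here is a minimal sketch. The class name DefaultSortDemo and the helper sortedCopy are made up for this example, and the sample words are arbitrary. It sorts three Japanese words with Arrays.sort and then shows the one case where code unit order and code point order disagree: a supplementary character compared against a BMP character above the surrogate range.

```java
import java.util.Arrays;

public class DefaultSortDemo {
    // Hypothetical helper: sorts a copy using String's natural order (compareTo).
    static String[] sortedCopy(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only.
        String kanji = "\u65E5\u672C";                // "Nihon" written in kanji
        String katakana = "\u30AB\u30BF\u30AB\u30CA"; // "katakana" written in katakana
        String hiragana = "\u3072\u3089\u304C\u306A"; // "hiragana" written in hiragana

        // Hiragana (U+3040..) sorts before katakana (U+30A0..) before kanji (U+4E00..).
        String[] sorted = sortedCopy(new String[] {kanji, katakana, hiragana});
        System.out.println(Arrays.toString(sorted));

        // U+1F600 is stored as the surrogate pair D83D DE00; U+FF21 is a single unit.
        String supplementary = "\uD83D\uDE00";
        String fullwidthA = "\uFF21";
        // Code unit order: 0xD83D < 0xFF21, so the supplementary character sorts first...
        System.out.println(supplementary.compareTo(fullwidthA) < 0);                 // true
        // ...even though its code point is the larger of the two.
        System.out.println(supplementary.codePointAt(0) > fullwidthA.codePointAt(0)); // true
    }
}
```

The result is a grouping by script block (hiragana, then katakana, then kanji), which is a consistent order but not a dictionary order.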
By default it's a plain comparison of UTF-16 code units (char values). This is the fastest way, and hence perfect if all you need is some consistent order (e.g. if you are going to use a binary search later, you need the strings to be in order, but just what "in order" means doesn't matter, so the faster the better).
If you need an ordering that is sensible to a user in a given locale, use the java.text.Collator class.
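A locale-aware sort might look like the sketch below. The class name CollatorSortDemo and the helper sortForLocale are invented for illustration. It relies on Collator implementing Comparator, so an instance can be passed straight to Arrays.sort. The exact order produced for Japanese text depends on the collation rules shipped with your JDK, so no output is shown.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class CollatorSortDemo {
    // Hypothetical helper: sorts a copy using the collation rules for the given locale.
    static String[] sortForLocale(String[] input, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        String[] copy = input.clone();
        Arrays.sort(copy, collator); // Collator is a Comparator<Object>
        return copy;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only.
        String[] words = {
            "\u65E5\u672C",             // kanji
            "\u30AB\u30BF\u30AB\u30CA", // katakana
            "\u3072\u3089\u304C\u306A"  // hiragana
        };
        String[] sorted = sortForLocale(words, Locale.JAPANESE);
        System.out.println(Arrays.toString(sorted));
    }
}
```

The difference from the default order is easy to see with Latin text: with Locale.ENGLISH this helper puts "apple" before "Banana", whereas a plain Arrays.sort puts "Banana" first because 'B' (0x42) is less than 'a' (0x61).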
They are compared according to the compareTo method of the String class. See the Javadoc:
Compares two strings lexicographically. The comparison is based on the Unicode value of each character in the strings. The character sequence represented by this String object is compared lexicographically to the character sequence represented by the argument string. The result is a negative integer if this String object lexicographically precedes the argument string. The result is a positive integer if this String object lexicographically follows the argument string. The result is zero if the strings are equal; compareTo returns 0 exactly when the equals(Object) method would return true.
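The contract quoted above can be checked with a small sketch. The class name CompareToDemo and the helper sign are made up for this example. It only looks at the sign of the result, since that is all the quoted text promises.

```java
public class CompareToDemo {
    // Hypothetical helper: reduces compareTo's result to -1, 0, or 1.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        String a = "\u3042"; // hiragana "a", U+3042
        String i = "\u3044"; // hiragana "i", U+3044

        System.out.println(sign(a, i)); // -1: U+3042 precedes U+3044
        System.out.println(sign(i, a)); // 1: the reverse comparison
        System.out.println(sign(a, a)); // 0: the strings are equal

        // compareTo returns 0 exactly when equals returns true.
        System.out.println((a.compareTo(i) == 0) == a.equals(i)); // true
    }
}
```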