sorting collections with string elements in java
I have a collections...i wrote code to sort values using my own comparator my comparator code is
private static class MatchComparator implements Comparator<xmlparse> {
@Override
public int compare(xmlparse object1, xmlparse object2) {
St开发者_开发技巧ring match1 = object1.getMatchId();
String match2 = object2.getMatchId();
return match1.compareTo(match2);
}
}
I will call Collections.sort(list,new MatchComparator());
Everything is fine but my problem is the sorted list is wrong when i print it...
Input for list
Match19
Match7
Match12
Match46
Match32
output from the sorted list
Match12
Match19
Match32
Match46
Match7
my expected output is
Match7
Match12
Match19
Match32
Match46
to get the order you need, you could either prefix the 1 digit numbers with zero ( eg Match07 ) or you have to split the string in a prefix and a numeric part, and implement the sorting as numeric comparison
The problem is that String.compareTo(..)
compares the words char by char.
If all string start with Match
, then you can easily fix this with:
public int compare(xmlparse object1, xmlparse object2) {
String match1 = object1.getMatchId();
String match2 = object2.getMatchId();
return Integer.parseInt(match1.replace("Match"))
- Integer.parseInt(match2.replace("Match"));
}
In case they don't start all with Match
, then you can use regex:
Integer.parseInt(object1.replaceAll("[a-zA-Z]+", ""));
Or
Integer.parseInt(object1.replaceAll("[\p{Alpha}\p{Punch}]+", ""));
And a final note - name your classes with uppercase, camelCase - i.e. XmlParse
instead of xmlparse
- that's what the convention dictates.
Implement a function and use it for comparison:
instead of
return match1.compareTo(match2);
use
return compareNatural(match1,match2);
Here is a function which does a natural comparison on strings:
private static int compareNatural(String s, String t, boolean caseSensitive) {
int sIndex = 0;
int tIndex = 0;
int sLength = s.length();
int tLength = t.length();
while (true) {
// both character indices are after a subword (or at zero)
// Check if one string is at end
if (sIndex == sLength && tIndex == tLength) {
return 0;
}
if (sIndex == sLength) {
return -1;
}
if (tIndex == tLength) {
return 1;
}
// Compare sub word
char sChar = s.charAt(sIndex);
char tChar = t.charAt(tIndex);
boolean sCharIsDigit = Character.isDigit(sChar);
boolean tCharIsDigit = Character.isDigit(tChar);
if (sCharIsDigit && tCharIsDigit) {
// Compare numbers
// skip leading 0s
int sLeadingZeroCount = 0;
while (sChar == '0') {
++sLeadingZeroCount;
++sIndex;
if (sIndex == sLength) {
break;
}
sChar = s.charAt(sIndex);
}
int tLeadingZeroCount = 0;
while (tChar == '0') {
++tLeadingZeroCount;
++tIndex;
if (tIndex == tLength) {
break;
}
tChar = t.charAt(tIndex);
}
boolean sAllZero = sIndex == sLength || !Character.isDigit(sChar);
boolean tAllZero = tIndex == tLength || !Character.isDigit(tChar);
if (sAllZero && tAllZero) {
continue;
}
if (sAllZero && !tAllZero) {
return -1;
}
if (tAllZero) {
return 1;
}
int diff = 0;
do {
if (diff == 0) {
diff = sChar - tChar;
}
++sIndex;
++tIndex;
if (sIndex == sLength && tIndex == tLength) {
return diff != 0 ? diff : sLeadingZeroCount - tLeadingZeroCount;
}
if (sIndex == sLength) {
if (diff == 0) {
return -1;
}
return Character.isDigit(t.charAt(tIndex)) ? -1 : diff;
}
if (tIndex == tLength) {
if (diff == 0) {
return 1;
}
return Character.isDigit(s.charAt(sIndex)) ? 1 : diff;
}
sChar = s.charAt(sIndex);
tChar = t.charAt(tIndex);
sCharIsDigit = Character.isDigit(sChar);
tCharIsDigit = Character.isDigit(tChar);
if (!sCharIsDigit && !tCharIsDigit) {
// both number sub words have the same length
if (diff != 0) {
return diff;
}
break;
}
if (!sCharIsDigit) {
return -1;
}
if (!tCharIsDigit) {
return 1;
}
} while (true);
} else {
// Compare words
// No collator specified. All characters should be ascii only. Compare character-by-character.
do {
if (sChar != tChar) {
if (caseSensitive) {
return sChar - tChar;
}
sChar = Character.toUpperCase(sChar);
tChar = Character.toUpperCase(tChar);
if (sChar != tChar) {
sChar = Character.toLowerCase(sChar);
tChar = Character.toLowerCase(tChar);
if (sChar != tChar) {
return sChar - tChar;
}
}
}
++sIndex;
++tIndex;
if (sIndex == sLength && tIndex == tLength) {
return 0;
}
if (sIndex == sLength) {
return -1;
}
if (tIndex == tLength) {
return 1;
}
sChar = s.charAt(sIndex);
tChar = t.charAt(tIndex);
sCharIsDigit = Character.isDigit(sChar);
tCharIsDigit = Character.isDigit(tChar);
} while (!sCharIsDigit && !tCharIsDigit);
}
}
}
a better one is here
The comparison is lexicographic and not numerical, that's your problem. In lexicoraphic ordering, 10 comes before 9.
See this question for open source implementation solutions. You can also implement your own string comparison, which shouldn't be that hard.
It seems that you expect not what the String.compareTo() really is. It performs so called lexicographical comarsion, but you try to compare it by number. You need to modify the code of your comparator.
@Override
public int compare(xmlparse object1, xmlparse object2) {
String match1 = object1.getMatchId();
String match2 = object2.getMatchId();
Long n1 = getNumber(match1);
Long n2 = getNumber(match2);
return n1.compareTo(n2);
}
where getNumber() extracts the last nuber from string "matchXX"
精彩评论