How to sort a list with diacritics without removing the diacritics

How to sort a list that contains letters with diacritic markings?

Words used in this example are made up.

Now I get a list that displays this:

  • báb
  • baz
  • bez

But I want to get a list that displays this:

  • baz
  • báb
  • bez

That is, a letter with a diacritic should be treated as a letter in its own right, sorted after its base letter. Is there a way to do this in C#?


If you set the culture of the current thread to the language you want to sort for, then this should work automagically (assuming you don't want some special customized sort order). Like this:

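// requires: using System.Globalization; using System.Threading;
List<string> mylist;
....
Thread.CurrentThread.CurrentCulture = new CultureInfo("pl-PL");
// Sort() with no arguments uses the default string comparison, which follows the current culture
mylist.Sort();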

This should give you the list sorted according to the Polish culture settings.
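If changing the culture of the whole thread is undesirable, the same idea can be sketched with a culture-specific comparer passed directly to Sort(). The class and method names below (CultureSortSketch, SortedFor) are made up for illustration; StringComparer.Create and CultureInfo.GetCultureInfo are standard .NET calls. Whether an accented letter ends up ordered as its own letter depends on the collation data for the chosen culture, so this may still not produce the order asked for in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: sorts a copy of the input using the rules of a named culture,
// without touching Thread.CurrentThread.CurrentCulture.
public static class CultureSortSketch
{
    public static List<string> SortedFor(IEnumerable<string> items, string cultureName)
    {
        // Look up the culture (e.g. "pl-PL") and build a case-sensitive comparer for it.
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        StringComparer comparer = StringComparer.Create(culture, false);

        // Copy first so the caller's collection is left unchanged.
        var result = new List<string>(items);
        result.Sort(comparer);
        return result;
    }
}
```

A call such as CultureSortSketch.SortedFor(mylist, "pl-PL") returns a new sorted list and leaves mylist as it was.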

Update: If the culture settings don't sort it the way you want then another option is to implement your own string comparer.

Update 2: String comparer example:

using System;
using System.Collections.Generic;

public class DiacriticStringComparer : IComparer<string>
{
    private static readonly HashSet<char> _Specials = new HashSet<char> { 'é', 'ń', 'ó', 'ú' };

    public int Compare(string x, string y)
    {
        // handle special cases first: x == null and/or y == null,  x.Equals(y)
        ...

        var lengthToCompare = Math.Min(x.Length, y.Length);
        for (int i = 0; i < lengthToCompare; ++i)
        {
            var cx = x[i];
            var cy = y[i];

            if (cx == cy) continue;

            if (_Specials.Contains(cx) || _Specials.Contains(cy))
            {
                // handle special diacritics comparison
                ...
            }
            else
            {
                // cx must be unequal to cy -> can only be larger or smaller
                return cx < cy ? -1 : 1;
            }
        }
        // once we are here the strings are equal up to lengthToCompare characters
        // we have already dealt with the strings being equal so now one must be shorter than the other
        return x.Length < y.Length ? -1 : 1;
    }
}

Disclaimer: I haven't tested it, but it should give you the general idea. Note that both char.CompareTo() and the < and > operators compare chars by their numeric UTF-16 code unit values (ordinal order), not by culture-aware rules. If you need linguistic ordering for the non-special characters, convert cx and cy into strings and use a culture-aware string comparison.
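One possible way to fill in the comparer idea, as a minimal sketch rather than a finished implementation: give every character a numeric sort key so that an accented letter lands immediately after its base letter. The class name AccentAsLetterComparer and the mapping table are assumptions made for this example (only a handful of letters are listed, one accented form per base letter), and the strings are assumed to use precomposed characters such as U+00E1 for á. All other characters fall back to ordinal order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: treats each listed accented letter as a separate
// letter that sorts immediately after its base letter (a < á < b).
public sealed class AccentAsLetterComparer : IComparer<string>
{
    // Assumed, illustrative mapping: accented letter -> base letter.
    // Extend it for the alphabet you actually need.
    private static readonly Dictionary<char, char> BaseOf = new Dictionary<char, char>
    {
        { 'á', 'a' }, { 'é', 'e' }, { 'ń', 'n' }, { 'ó', 'o' }, { 'ú', 'u' }
    };

    // Sort key for one char: twice the code of the (base) letter,
    // plus 1 if the char is one of the accented letters above.
    private static int Key(char c)
    {
        char baseLetter;
        return BaseOf.TryGetValue(c, out baseLetter) ? baseLetter * 2 + 1 : c * 2;
    }

    public int Compare(string x, string y)
    {
        // Null handling: null sorts before any non-null string.
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        // Compare position by position using the per-character keys.
        int n = Math.Min(x.Length, y.Length);
        for (int i = 0; i < n; i++)
        {
            int diff = Key(x[i]).CompareTo(Key(y[i]));
            if (diff != 0) return diff;
        }

        // Equal up to the shorter length: the shorter string sorts first.
        return x.Length.CompareTo(y.Length);
    }
}
```

With the words from the question, mylist.Sort(new AccentAsLetterComparer()) on báb, baz, bez gives baz, báb, bez, because á gets a key just above a and below b. Doubling the key keeps the comparison simple; a table of explicit alphabet positions would be the more flexible choice if several accented forms share one base letter.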
