开发者

Why does .NET 4.0 sort this array differently than .NET 3.5?

This stackoverflow question raised an interesting question about sorting double arrays with NaN values. The OP posted the following code:

static void Main(string[] args)
{
    double[] someArray = { 4.0, 2.0, double.NaN, 1.0, 5.0, 3.0, double.NaN, 10.0, 9.0, 8.0 };

    foreach (double db in someArray)
    {
        Console.WriteLine(db);
    }

    Array.Sort(someArray);
    Console.WriteLine("\n\n");
    foreach (double db in someArray)
    {
        Console.WriteLine(db);
    }

    Console.ReadLine();
}

When you run this under the .NET 3.5 framework, the array is sorted as follows:

1,4,NaN,2,3,5,8,9,10,NaN

When you run it under 开发者_如何学Go.NET 4.0, the array is sorted somewhat more logically:

NaN,NaN,1,2,3,4,5,8,9,10

I can understand why it would sort weirdly in .NET 3.5 (because NaN is not equal to, less than, or greater than, anything). I can also understand why it would sort the way it does in .NET 4.0. My question is, why did this change from 3.5 to 4.0? And where is the Microsoft documentation for this change?


It's a bug fix. The feedback report with the bug details is here. Microsoft's response to the bug report:

Note that this bug affects the following:

  • Array.Sort(), where the array contains Double.NaN
  • Array.Sort(), where the array contains Single.NaN
  • any callers of above, for example on List.Sort(), where list contains Double.NaN

This bug will be fixed in the next major version of the runtime; until then you can work around this by using a custom IComparer that does the correct sorting. As mentioned in the workaround comments, don't use Comparer.Default, because this is special-cased with a shortcut sort routine that doesn't handle NaN correctly. Instead, you can provide your own comparer that provides an equivalent comparision, but won't be special-cased.


Not really an answer, but perhaps a clue... You can reproduce the weird 3.5 behavior in 4.0 with this code:

void Main()
{
    double[] someArray = { 4.0, 2.0, double.NaN, 1.0, 5.0, 3.0, double.NaN, 10.0, 9.0, 8.0 };
    Array.Sort(someArray, CompareDouble);
    someArray.Dump();
}

int CompareDouble(double a, double b)
{
    if (a > b)
        return 1;
    if (a < b)
        return -1;
    return 0;
}

Here, both a > b and a < b return false if a or b is NaN, so the CompareDouble method returns 0, so NaN is considered equal to everything... This gives the same result as in 3.5:

1,4,NaN,2,3,5,8,9,10,NaN


I don't have the code for the .NET 3.5 runtime to verify this, but I'm thinking they fixed a bug in the default comparer for double to bring it into line with the documentation.

According to that document, Double.Compare treats NaN as equal to PositiveInfinity and NegativeInfinity, and less than any other value.

The documentation is the same for .NET 3.5 and .NET 4.0, so I have to think that this was a bug fix to make the code work as documented.

EDIT:

After reading the comments in the linked question, I have to think that the problem wasn't in Double.Compare, but rather in whatever method Array.Sort (which is what List.Sort depends on) uses to compare Double values. Somehow, I don't think it's really Double.Compare.


[This is my response shameless ripped from the other post. It would be nice is someone explored this further -- double.CompareTo and double.CompareTo(double) are well-defined, as indicated below so I suspect there is some Array.Sort magic happening for the specific type.]

Array.Sort(double[]): doesn't seem to be using CompareTo(double[]) as expected and this may very well be a bug -- note the difference in Array.Sort(object[]) and Array.Sort(double[]) below. I would love clarification/corrections on the following.

First, the double.CompareTo(T) method documentation -- this ordering is well-defined according to the documentation:

Less than zero: This instance is less than value. -or- This instance is not a number (NaN) and value is a number.

Zero: This instance is equal to value. -or- Both this instance and value are not a number (NaN), PositiveInfinity, or NegativeInfinity.

Greater than zero: This instance is greater than value. -or- This instance is a number and value is not a number (NaN).

In LINQPad (3.5 and 4, both have same results):

0d.CompareTo(0d).Dump();                  // 0
double.NaN.CompareTo(0d).Dump();          // -1
double.NaN.CompareTo(double.NaN).Dump();  // 0
0d.CompareTo(double.NaN).Dump();          // 1

Using CompareTo(object) has the same results:

0d.CompareTo((object)0d).Dump();                  // 0
double.NaN.CompareTo((object)0d).Dump();          // -1
double.NaN.CompareTo((object)double.NaN).Dump();  // 0
0d.CompareTo((object)double.NaN).Dump();          // 1

So that's not the problem.

Now, from the Array.Sort(object[]) documentation -- there is no use of >, < or == (according to the documentation) -- just CompareTo(object).

Sorts the elements in an entire one-dimensional Array using the IComparable implementation of each element of the Array.

Likewise, Array.Sort(T[]) uses CompareTo(T).

Sorts the elements in an entire Array using the IComparable(Of T) generic interface implementation of each element of the Array.

Let's see:

LINQPad (4):

var ar = new double[] {double.NaN, 0, 1, double.NaN};
Array.Sort(ar);
ar.Dump();
// NaN, NaN, 0, 1

LINQPad (3.5):

var ar = new double[] {double.NaN, 0, 1, double.NaN};
Array.Sort(ar);
ar.Dump();
// NaN, 0, NaN, 1

LINQPad (3.5) -- NOTE THE ARRAY IS OF OBJECT and the behavior is "expected" per the CompareTo contract.

var ar = new object[] {double.NaN, 0d, 1d, double.NaN};
Array.Sort(ar);
ar.Dump();
// NaN, NaN, 0, 1

Hmm. Really. In conclusion:

I HAVE NO IDEA -- but I suspect there is some "optimization" resulting in CompareTo(double) not being invoked.

Happy coding.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜