开发者

C# QuickSort too slow

I'm learning different types of sorting now, and I found out that, starting from a certain point, my QuickSort algorithm doesn't work that quick at all.

Here is my code:

class QuickSort
    {

       // partitioning array on the key so that the left part is <=key, right part > key
            private int Partition(int[开发者_如何学编程] arr, int start, int end)
            {
                    int key = arr[end];
                    int i = start - 1;
                    for (int j = start; j < end; j++)
                    {
                            if (arr[j] <= key) Swap(ref arr[++i], ref arr[j]);
                    }
                    Swap(ref arr[++i], ref arr[end]);
                    return i;
            }


            // sorting
            public void QuickSorting(int[] arr, int start, int end)
            {
                    if (start < end)
                    {
                            int key = Partition(arr, start, end);
                            QuickSorting(arr, start, key - 1);
                            QuickSorting(arr, key + 1, end);
                    }
            }
      }


    class Test
    {
            static void Main(string[] args)
            {                       
                    QuickSort quick = new QuickSort();
                    Random rnd = new Random(DateTime.Now.Millisecond);

                    int[] array = new int[1000000];

                    for (int i = 0; i < 1000000; i++)
                    {
                            int i_rnd = rnd.Next(1, 1000);
                            array[i] = i_rnd;
                    }

                    quick.QuickSorting(array, 0, array.Length - 1);

            }
      }

It takes about 15 seconds to run this code on an array of a million elements. While, for example, MergeSort or HeapSort do the same in less than a second.

Could you please explain to me why this can happen?


How quick your sort is and which algorithm you should use depends a lot of your input data. Is it random, nearly sorted, reversed etc.

There's a very nice page that illustrates how the different sorting algorithms work:

  • Sorting Algorithm Animations


Have you considered inlining the Swap method? It shouldn't be hard to do so, but it may be that the JIT is finding it hard to inline.

When I implemented quicksort for Edulinq I didn't see this problem at all - you may want to try my code (the simplest, recursive form probably) to see how that performs for you. If it does well, try to work out where the differences are.

While different algorithms will behave differently with the same data, I wouldn't expect to see this much difference on randomly-generated data.


You have 1,000,000 random elements with 1,000 distinct values. So, we can expect most values to appear about 1,000 times in your array. This gives you some quadratic O(n^2) running time.

To partition the array in 1,000 pieces, where every partition contains the same number, happens at a stack depth of about log2(1000), about 10. (That is assuming a call to partition neatly breaks it up in two pieces.) That's about 10,000,000 operations.

To quicksort the last 1,000 partitions, all containing 1,000 identical values. We need 1,000 times 1,000 + 999 + 998 + ... + 1 comparisons. (At every round quicksort reduces the problem by one, only removing the key/pivot.) That gives 500,000,000 operations. The most ideal way of quicksort 1,000 partitions would be 1,000 times 1,000*10 operations = 10,000,000. Because of the identical values, you hit a quadratic case here, quicksort's worst case performance. So, about halfway down the quicksort, it goes to worst case behavior.

If every value occurs only a few times, it doesn't matter if you sort those few tiny partitions in O(N^2) or O(N logN). But here we had a lot and huge partitions to be sorted in O(N^2).


To improve your code: partition in 3 pieces. Smaller than the pivot, equal to the pivot and bigger than the pivot. Then, only quicksort the first and last partitions. You will need to do an extra compare; test for equality first. But I think, for this input, it would be a lot faster.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜