开发者

Is there a performance difference between these two algorithms for shuffling an IEnumerable?

These two questions give similar algorthims for shuffling an IEnumerable:

  • C#: Is using Random and OrderBy a good shuffle algorithm?
  • Can you enumerate a collection in C# out of order?

Here are the two methods side-by-side:

public static IEnumerable<T> Shuffle1<T> (this IEnumerable<T> source)
{
    Random random = new Random ();
    T [] copy = source.ToArray ();

    for (int i = copy.Length - 1; i >= 0; i--) {
        int index = random.Next (i + 1);
        yield return copy [index];
        copy [index] = copy [i];
    }
}


public static IEnumerable<T> Shuffle2<T> (this IEnumerable<T> source)
{
    Random random = new Random ();
    List<T> copy = source.ToList ();

    while (copy.Count > 0) {
        int index = random.Next (copy.Count);
        yield return copy [index];
        copy.RemoveAt (index);
    }
}

They are basically identical, except one uses a List, and one uses an array. Conceptually, the second one seems more clear 开发者_开发百科to me. But is there a substantial performance benefit to be gained from using an array? Even if the Big-O time is the same, if it is several times faster, it could make a noticeable difference.


The second version will probably be a bit slower because of RemoveAt. Lists are really arrays that grow when you add elements to them, and as such, insertion and removal in the middle is slow (in fact, MSDN states that RemoveAt has an O(n) complexity).

Anyway, the best would be to simply use a profiler to compare both methods.


The first doesn't compile, although it's apparent that you're trying to reify the enumerable, and then implement Fisher-Yates; that's probably the correct approach, and it shouldn't be unclear to anyone who has ever shuffled an array before. The second using RemoveAt is bad for the reasons stated by other commenters.

EDIT: Your top implementation looks like it's correct now, and it's a good way to do it.


The first algorithm is O(n) as it has a loop which performs an O(1) swap on each iteration. The second algorithm is O(n^2) as it performs an O(n) RemoveAt operation on each iteration. In addition indexers on lists are slower than indexes on arrays because the former is a method call whereas the latter is an IL instruction.

So out of the two the first one is likely to be faster. That said, if you're after performance, why bother yielding the results? It's already converting to an array so just shuffle that in place and return the array directly (or wrapped in a ReadOnlyCollection<T> if you're worried about people changing it) which is probably faster still.

On a side note, both methods have bugs that the behaviour of Random when used by multiple threads is undefined, so they should probably use a thread-safe random number generator.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜