Is there a performance difference between these two algorithms for shuffling an IEnumerable?

2023-01-30 13:53 问答作者：

These two questions give similar algorthims for shuffling an IEnumerable:

C#: Is using Random and OrderBy a good shuffle algorithm?
Can you enumerate a collection in C# out of order?

Here are the two methods side-by-side:

public static IEnumerable<T> Shuffle1<T> (this IEnumerable<T> source)
{
    Random random = new Random ();
    T [] copy = source.ToArray ();

    for (int i = copy.Length - 1; i >= 0; i--) {
        int index = random.Next (i + 1);
        yield return copy [index];
        copy [index] = copy [i];
    }
}


public static IEnumerable<T> Shuffle2<T> (this IEnumerable<T> source)
{
    Random random = new Random ();
    List<T> copy = source.ToList ();

    while (copy.Count > 0) {
        int index = random.Next (copy.Count);
        yield return copy [index];
        copy.RemoveAt (index);
    }
}

They are basically identical, except one uses a List, and one uses an array. Conceptually, the second one seems more clear 开发者_开发百科to me. But is there a substantial performance benefit to be gained from using an array? Even if the Big-O time is the same, if it is several times faster, it could make a noticeable difference.

The second version will probably be a bit slower because of RemoveAt. Lists are really arrays that grow when you add elements to them, and as such, insertion and removal in the middle is slow (in fact, MSDN states that RemoveAt has an O(n) complexity).

Anyway, the best would be to simply use a profiler to compare both methods.

The first doesn't compile, although it's apparent that you're trying to reify the enumerable, and then implement Fisher-Yates; that's probably the correct approach, and it shouldn't be unclear to anyone who has ever shuffled an array before. The second using RemoveAt is bad for the reasons stated by other commenters.

EDIT: Your top implementation looks like it's correct now, and it's a good way to do it.

The first algorithm is O(n) as it has a loop which performs an O(1) swap on each iteration. The second algorithm is O(n^2) as it performs an O(n) RemoveAt operation on each iteration. In addition indexers on lists are slower than indexes on arrays because the former is a method call whereas the latter is an IL instruction.

So out of the two the first one is likely to be faster. That said, if you're after performance, why bother yielding the results? It's already converting to an array so just shuffle that in place and return the array directly (or wrapped in a ReadOnlyCollection<T> if you're worried about people changing it) which is probably faster still.

On a side note, both methods have bugs that the behaviour of Random when used by multiple threads is undefined, so they should probably use a thread-safe random number generator.

继续阅读：ienumerable linq performance

Is there a performance difference between these two algorithms for shuffling an IEnumerable?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？