Why can LINQ operations be faster than a normal loop?

2023-01-17 02:11 问答作者：

A friend and I were a bit perplexed during a programming discussion today开发者_C百科. As an example, we created a fictive problem of having a List<int> of n random integers (typically 1.000.000) and wanted to create a function that returned the set of all integers that there were more than one of. Pretty straightforward stuff. We created one LINQ statement to solve this problem, and a plain insertion sort based algorithm.

Now, as we tested the speed the code ran at (using System.Diagnostics.StopWatch), the results were confusing. Not only did the LINQ code outperform the simple sort, but it ran faster than a single foreach/for that only did a single loop of the list, and that had no operations within (which, on a side track, I thought the compiler was supposed to discover and remove alltogether).

If we generated a new List<int> of random numbers in the same execution of the program and ran the LINQ code again, the performance would increase by orders of magnitude (typically thousandfold). The performance of the empty loops were of course the same.

So, what is going on here? Is LINQ using parallelism to outperform normal loops? How are these results even possible? LINQ uses quicksort which runs at n*log(n), which per definition is already slower than n.

And what is happening at the performance leap on the second run?

We were both baffled and intrigued at these results and were hoping for some clarifying insights from the community, just to satisfy our own curiosity.

Undoubtedly you haven't actually performed the query, you've merely defined it. LINQ constructs an expression tree that isn't actually evaluated until you perform an operation that requires that the enumeration be iterated. Try adding a ToList() or Count() operation to the LINQ query to force the query to be evaluated.

Based on your comment I expect this is similar to what you've done. Note: I haven't spent any time figuring out if the query is as efficient as possible; I just want some query to illustrate how the code may be structured.

var dataset = ...

var watch = Stopwatch.StartNew();

var query = dataset.Where( d => dataset.Count( i => i == d ) > 1 );

watch.Stop();  // timer stops here

foreach (var item in query) // query is actually evaluated here
{
   ... print out the item...
}

I would suggest that LINQ is only faster than a 'normal loop' when your algorithm is less than perfect (or you have some problem in your code). So LINQ will be faster at sorting than you are if you don't write an efficient sorting algorithm, etc.

LINQ is usually 'as fast as' or 'close enough to' the speed of a normal loop, and can be faster (and simpler) to code / debug / read. That's its benefit - not execution speed.

If it's performing faster than an empty loop, you are doing something wrong. Most likely, as suggested in comments, you aren't considering deferred execution and the LINQ statement is not actually executing.

If you did not compile with "Optimize Code" enabled, you would probably see this behaviour. (It would certainly explain why the empty loop was not removed.)

The code underlying LINQ, however, is part of already-compiled code, which will certainly have been optimised (by the JIT, NGen or similar).

继续阅读：algorithm linq performance sorting

Why can LINQ operations be faster than a normal loop?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？