Linq to Objects: filtering performance question

2023-01-19 23:05 问答作者：

I was thinking about the way linq computes and it made me wonder:

If I write

var count = collection.Count(o => o.Category == 3);

Will that perform any differently than:

var count = collection.Where(o => o.Category开发者_运维技巧 == 3).Count();

After all, IEnumerable<T>.Where() will return IEnumerable<T> which doesn't implement Count property, so a subsequent Count() would actually have to walk through the items to determine the count which should cause extra time being spent on this.

I wrote some quick test code to get some metrics but they seem to beat each other at random. I won't put in the test code here initially, but if anyone requests, I'll get it in.

So, am I missing something?

There won't be a lot in it, really - both forms will iterate over the collection, check the predicate against each item, and count the matches. Both approaches will stream the data - it's not like Where is actually building an in-memory list of all matches, for example.

The first form has one fewer (thin) layer of indirection in, that's all. The main reason for using it (IMO) is for readability/simplicity, rather than performance.

As Jon Skeet says, both techniques will have to essentially do the same thing - enumerate the sequence while conditionally incrementing a counter when the predicate is matched. Any performance differences between the two should be slight: insignificant for almost all use-cases. If there is a token winner though, I would think it should be the first one, since from reflector it appears that the overload ofCountthat takes a predicate uses its ownforeachto enumerate rather than the more obvious way of offloading the work to a streaming aWhereinto a parameterlessCountas in your second example. This means technique #1 is likely to have two minor performance benefits:

Fewer argument validation (null-tests etc.) checks. Technique #2's Count will also check if its (piped) input is an ICollection or ICollection<T> , which it can't possibly be.
A single constructed enumerator vs two enumerators piped together (an additional state-machine has costs).

There is one minor in favour of technique #2 point though:Whereis slightly more sophisticated in constructing an enumerator for the source-sequence; it uses a different one for lists and arrays. This may make it more performant in certain scenarios.

Of course, I should reiterate that I might be plain wrong about my analysis - reasoning about performance through static code analysis, especially when the differences are likely to be slight, is not a good idea. There is only one way to find out - measuring the execution times for your specific setup.

FYI, the source I reflected was from .NET 3.5 SP1.

I know what you are thinking here. At least, I think I do; Count() will look to see if Count is available as a property, and will simply return that if so. Otherwise, it has to enumerate the items to get its return value.

The version of Count() which accepts the predicate, though, will always cause the collection to be iterated, since it has to do it to see which ones match.

Above answers make good points, consider also that if you break away into any Linq-To-X implementations that deferred execution (Linq to Sql being the primary), the Expression parameters used in these methods may cause different results.

继续阅读：linq linq-to-objects performance

Linq to Objects: filtering performance question

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？