LINQ Ring: Any() vs Contains() for Huge Collections
Given a huge collection of objects, is there a performance difference between the following?
Collection.Contains:
myCollection.Contains(myElement)
Enumerable.Any:
myCollection.Any(currentElement => currentElement == myElement)
Contains() is an instance method, and its performance depends largely on the collection itself. For instance, Contains() on a List<T> is O(n), while Contains() on a HashSet<T> is O(1).
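To make the List versus HashSet point concrete, here is a minimal sketch. The class and method names (ContainsDemo, InList, InSet) are made up for illustration; only List<T>.Contains and HashSet<T>.Contains come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class ContainsDemo
{
    // List<T>.Contains scans elements one by one: O(n) in the worst case.
    public static bool InList(int count, int probe)
    {
        List<int> list = Enumerable.Range(0, count).ToList();
        return list.Contains(probe);
    }

    // HashSet<T>.Contains looks the value up by hash code: O(1) on average.
    public static bool InSet(int count, int probe)
    {
        HashSet<int> set = new HashSet<int>(Enumerable.Range(0, count));
        return set.Contains(probe);
    }
}
```

Both calls return the same answer; only the amount of work differs, and that gap grows with the size of the collection.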
Any() is an extension method that simply walks the collection, applying the delegate to each element until one matches. Its worst-case complexity is therefore O(n). Any() is more flexible, however, since you can pass a delegate; Contains() can only accept an object.
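As a rough illustration of that flexibility, the sketch below uses Any() with a condition that is not a plain equality check, which Contains() cannot express. AnyDemo and HasLongName are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class AnyDemo
{
    // True if at least one name has minLength characters or more.
    // Any() stops enumerating at the first element that matches.
    public static bool HasLongName(IEnumerable<string> names, int minLength)
    {
        return names.Any(name => name.Length >= minLength);
    }
}
```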
It depends on the collection. If you have an ordered collection, then Contains might do a smart search (binary, hash, B-tree, etc.), while with Any() you are basically stuck with enumerating until you find it (assuming LINQ-to-Objects).
Also note that in your example, Any() is using the == operator, which checks for referential equality on reference types that do not overload it, while Contains will use IEquatable<T> or the Equals() method, which might be overridden.
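Here is a small sketch of how the two can disagree. Point is a hypothetical class that overrides Equals() but does not overload ==, so the two checks give different answers for an equal-but-distinct instance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: value-style Equals, but no == overload.
public sealed class Point
{
    public int X { get; }
    public int Y { get; }
    public Point(int x, int y) { X = x; Y = y; }

    public override bool Equals(object obj) => obj is Point p && p.X == X && p.Y == Y;
    public override int GetHashCode() => (X * 397) ^ Y;
}

public static class EqualityDemo
{
    // List<T>.Contains goes through the default equality comparer, so Equals() is used.
    public static bool ByContains(List<Point> points, Point target) => points.Contains(target);

    // == on Point is reference equality here, because Point does not overload it.
    public static bool ByAnyOperator(List<Point> points, Point target) => points.Any(p => p == target);

    // Calling Equals() inside the predicate matches what Contains does.
    public static bool ByAnyEquals(List<Point> points, Point target) => points.Any(p => p.Equals(target));
}
```

With a list holding new Point(1, 2) and a separately constructed new Point(1, 2) as the target, ByContains and ByAnyEquals find it while ByAnyOperator does not.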
I suppose that would depend on the type of myCollection, which dictates how Contains() is implemented. If it is a sorted binary tree, for example, it could search smarter. It may also take the element's hash into account. Any(), on the other hand, will enumerate through the collection until the first element that satisfies the condition is found. It makes no use of any smarter search method the collection might offer.
Contains() is also an extension method, which can work fast if you use it in the correct way. For example:
var result = context.Projects.Where(x => lstBizIds.Contains(x.businessId)).Select(x => x.projectId).ToList();
This will give the query
SELECT Id
FROM Projects
INNER JOIN (VALUES (1), (2), (3), (4), (5)) AS Data(Item) ON Projects.UserId = Data.Item
while Any(), on the other hand, always iterates through the collection, which is O(n).
Hope this will work....