开发者

What is the difference between HashSet<T> and List<T>?

Can you explain what is the difference between HashSet<T> and List<T> in .NET?

Maybe you can explain with an example in what cases HashSet<T> should be preferred against L开发者_如何转开发ist<T> ?


Unlike a List<> ...

  1. A HashSet is a List with no duplicate members.

  2. Because a HashSet is constrained to contain only unique entries, the internal structure is optimised for searching (compared with a list) - it is considerably faster

  3. Adding to a HashSet returns a boolean - false if addition fails due to already existing in Set

  4. Can perform mathematical set operations against a Set: Union/Intersection/IsSubsetOf etc.

  5. HashSet doesn't implement IList only ICollection

  6. You cannot use indices with a HashSet, only enumerators.

The main reason to use a HashSet would be if you are interested in performing Set operations.

Given 2 sets: hashSet1 and hashSet2

 //returns a list of distinct items in both sets
 HashSet set3 = set1.Union( set2 );

flies in comparison with an equivalent operation using LINQ. It's also neater to write!


To be more precise lets demonstrate with examples,

You can not use HashSet like in the following example.

HashSet<string> hashSet1 = new HashSet<string>(){"1","2","3"};
for (int i = 0; i < hashSet1.Count; i++)
    Console.WriteLine(hashSet1[i]);

hashSet1[i] would produce an error:

Cannot apply indexing with [] to an expression of type 'System.Collections.Generic.HashSet'

You can use foreach statement:

foreach (var item in hashSet1)
    Console.WriteLine(item);

You can not add duplicated items to HashSet while List allows you to do this and while you are adding an item to HashSet, you can check wheter it contains the item or not.

HashSet<string> hashSet1 = new HashSet<string>(){"1","2","3"};
if (hashSet1.Add("1"))
   Console.WriteLine("'1' is successfully added to hashSet1!");
else
   Console.WriteLine("'1' could not be added to hashSet1, because it contains '1'");

HashSet has some useful functions like IntersectWith, UnionWith, IsProperSubsetOf, ExceptWith, SymmetricExceptWith etc.

IsProperSubsetOf:

HashSet<string> hashSet1 = new HashSet<string>() { "1", "2", "3", "4" };
HashSet<string> hashSet2 = new HashSet<string>() { "2", "4", "6", "8" };
HashSet<string> hashSet3 = new HashSet<string>() { "1", "2", "3", "4", "5" };
if (hashSet1.IsProperSubsetOf(hashSet3))
    Console.WriteLine("hashSet3 contains all elements of hashSet1.");
if (!hashSet1.IsProperSubsetOf(hashSet2))
    Console.WriteLine("hashSet2 does not contains all elements of hashSet1.");

UnionWith:

HashSet<string> hashSet1 = new HashSet<string>() { "3", "4" };
HashSet<string> hashSet2 = new HashSet<string>() { "2", "4", "6", "8" };
hashSet1.UnionWith(hashSet2); //hashSet1 -> 3, 2, 4, 6, 8

IntersectWith:

HashSet<string> hashSet1 = new HashSet<string>() { "3", "4", "8" };
HashSet<string> hashSet2 = new HashSet<string>() { "2", "4", "6", "8" }
hashSet1.IntersectWith(hashSet2);//hashSet1 -> 4, 8

ExceptWith :

 HashSet<string> hashSet1 = new HashSet<string>() { "1", "2", "3", "5", "6" };
 HashSet<string> hashSet2 = new HashSet<string>() { "1", "2", "3", "4" };
 hashSet1.ExceptWith(hashSet2);//hashSet1 -> 5, 6

SymmetricExceptWith :

 HashSet<string> hashSet1 = new HashSet<string>() { "1", "2", "3", "5", "6" };
 HashSet<string> hashSet2 = new HashSet<string>() { "1", "2", "3", "4" };
 hashSet1.SymmetricExceptWith(hashSet2);//hashSet1 -> 4, 5, 6

By the way, the order is not preserved in HashSets. In the example, we added element "2" last but it is in the second order:

HashSet<string> hashSet1 = new HashSet<string>() { "3", "4", "8" };
hashSet1.Add("1");    // 3, 4, 8, 1
hashSet1.Remove("4"); // 3, 8, 1
hashSet1.Add("2");    // 3, 2 ,8, 1


A HashSet<T> is a class designed to give you O(1) lookup for containment (i.e., does this collection contain a particular object, and tell me the answer fast).

A List<T> is a class designed to give you a collection with O(1) random access than can grow dynamically (think dynamic array). You can test containment in O(n) time (unless the list is sorted, then you can do a binary search in O(log n) time).

Maybe you can explain with an example in what cases HashSet<T> should be prefered against List<T>

When you want to test containment in O(1).


Use a List<T> when you want to:

  • Store a collection of items in a certain order.

If you know the index of the item you want (rather than the value of the item itself) retrieval is O(1). If you don't know the index, finding the item takes more time, O(n) for an unsorted collection.

Use a Hashset<T> when you want to:

  • Quickly find out if a certain object is contained in a collection.

If you know the name of the thing you want to find, Lookup is O(1) (that's the 'Hash' part). It doesn't maintain an ordering like the List<T> does and you can't store duplicates (adding a duplicate has no effect, that's the 'Set' part).

An example of when to use a Hashset<T> would be if you want to find out if a word played in a game of Scrabble is a valid word in English (or other language). Even better would be if you wanted to build a web service to be used by all instances of an online version of such a game.

A List<T> would be a good data structure for creating the scoreboard to track player scores.


List is an ordered list. It is

  • accessed by an integer index
  • can contain duplicates
  • has a predictable order

HashSet is a set. It:

  • Can block duplicate items (see Add(T))
  • Does not guarantee the order of the items within the set
  • Has operations you would expect on a set, e.g., IntersectWith, IsProperSubsetOf, UnionWith.

List is more appropriate when you want to access you collection as though it were like an array to which you could append, insert and remove items. HashSet is a better choice if you want to treat your collection like a "bag" of items in which order is not important or when you want to compare it with other sets using the operations such as IntersectWith or UnionWith.


List is not necessarily unique, while hashset is, for one.


A List is an ordered collection of objects of Type T that unlike an array you can add and remove entries.

You would use a list where you want to reference the members in the order you stored them and you are accessing them by an position rather than the item itself.

A HashSet is like a dictionary that the item itself is the key as well as the value, the ordering is not guaranteed.

You would use a HashSet where you want to check that an object is in the collection


If you decide to apply these data structures to actual usage in data-driven development, a HashSet is VERY helpful in testing replication against data adapter sources, for data cleansing and migration.

Also, if using the DataAnnotations Class, one can implement Key logic on class properties and effectively control a Natural Index (clustered or not) with a HashSet, where this would be very difficult in a List implementation.

A strong option for using a list is to implement generics for multiple mediums on a View Model, such as sending a list of classes to a MVC View for a DropDownList Helper, and also for sending as a JSON construct via WebApi. The list allows typical class collection logic, and keeps flexibility for a more "Interface" like approach to computing a single view model to different mediums.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜