IEnumerable<IEnumerable<int>> - no duplicate IEnumerable<int>s
I'm trying to find a solution to this problem:
Given a IEnumerable< IEnumerable< int>> I need a method/algorithm that returns the input, but in case of several IEnmerable< int> with the same elements only one per coincidence/group is returned.
ex.
IEnumerable<IEnumerable<int开发者_JS百科>> seqs = new[]
{
new[]{2,3,4}, // #0
new[]{1,2,4}, // #1 - equals #3
new[]{3,1,4}, // #2
new[]{4,1,2} // #3 - equals #1
};
"foreach seq in seqs" .. yields {#0,#1,#2} or {#0,#2,#3}
Sould I go with ..
.. some clever IEqualityComparer
.. some clever LINQ combination I havent figured out - groupby, sequenceequal ..?
.. some seq->HashSet stuff
.. what not. Anything will help
I'll be able to solve it by good'n'old programming but inspiration is always appreciated.
Here's a slightly simpler version of digEmAll's answer:
var result = seqs.Select(x => new HashSet<int>(x))
.Distinct(HashSet<int>.CreateSetComparer());
Given that you want to treat the elements as sets, you should have them that way to start with, IMO.
Of course this won't help if you want to maintain order within the sequences that are returned, you just don't mind which of the equal sets is returned... the above code will return an IEnumerable<HashSet<int>>
which will no longer have any ordering within each sequence. (The order in which the sets are returned isn't guaranteed either, although it would be odd for them not to be return in first-seen-first-returned basis.)
It feels unlikely that this wouldn't be enough, but if you could give more details of what you really need to achieve, that would make it easier to help.
As noted in comments, this will also assume that there are no duplicates within each original source array... or at least, that they're irrelevant, so you're happy to treat { 1 } and { 1, 1, 1, 1 } as equal.
Use the correct collection type for the job. What you really want is ISet<IEnumerable<int>>
with an equality comparer that will ignore the ordering of the IEnumerable
s.
EDITED:
You can get what you want by building your own IEqualityComparer<IEnumerable<int>>
e.g.:
public class MyEqualityComparer : IEqualityComparer<IEnumerable<int>>
{
public bool Equals(IEnumerable<int> x, IEnumerable<int> y)
{
return x.OrderBy(el1 => el1).SequenceEqual(y.OrderBy(el2 => el2));
}
public int GetHashCode(IEnumerable<int> elements)
{
int hash = 0;
foreach (var el in elements)
{
hash = hash ^ el.GetHashCode();
}
return hash;
}
}
Usage:
var values = seqs.Distinct(new MyEqualityComparer()).ToList();
N.B.
this solution is slightly different from the one given by Jon Skeet.
His answer considers sublists as sets, so basically two lists like [1,2]
and [1,1,1,2,2]
are equal.
This solution don't, i.e. :
[1,2,1,1]
is equal to [2,1,1,1]
but not to [2,2,1,1]
, hence basically the two lists have to contain the same elements and in the same number of occurrences.
精彩评论