Filtering subsets using Linq
Imagine I have a very long enumeration, too big to reasonably convert to a list. Imagine also that I want to remove duplicates from it. Lastly, imagine that I know that only a small subset of the initial enumeration could possibly contain duplicates. That last point is what makes the problem practical.
Basically, I want to filter the enumeration with some predicate and call Distinct() only on the matching subset, then recombine the result with the elements for which the predicate returned false.
Can anyone think of a good idiomatic Linq way of doing this? I suppose the question boils down to the following:
With Linq how can you perform selective processing on a predicated enumeration and recombine the result stream with the rejected cases from the predicate?
You can do it by traversing the enumeration twice: once to apply the predicate and remove duplicates, and a second time to apply the negation of the predicate. Another solution is to write your own variant of the Where extension method that pushes non-matching entries into a buffer on the side:
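public static IEnumerable<T> WhereTee<T>(this IEnumerable<T> input, Predicate<T> pred, List<T> buffer)
{
    // Must be declared inside a static class to be usable as an extension method.
    foreach (T t in input)
    {
        if (pred(t))
        {
            yield return t;   // matching elements are streamed to the caller
        }
        else
        {
            buffer.Add(t);    // non-matching elements are collected on the side
        }
    }
}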
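To show how WhereTee might be used for the original problem, here is a minimal sketch. The sample data, the `mayContainDuplicates` predicate, and the class names are made up for illustration. Since the buffer keeps everything pushed into it in memory, the sketch arranges the predicate so that the small, possibly duplicated subset lands in the buffer while the bulk of the enumeration is streamed through.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Same method as above: streams matches, buffers everything else.
    public static IEnumerable<T> WhereTee<T>(this IEnumerable<T> input, Predicate<T> pred, List<T> buffer)
    {
        foreach (T t in input)
        {
            if (pred(t))
                yield return t;
            else
                buffer.Add(t);
        }
    }
}

public static class WhereTeeDemo
{
    public static void Main()
    {
        // Hypothetical data: only values below 10 can be duplicated.
        IEnumerable<int> source = new[] { 1, 2, 2, 10, 3, 3, 11 };
        Func<int, bool> mayContainDuplicates = x => x < 10;

        var candidates = new List<int>();

        // Stream the elements that cannot be duplicates; buffer the rest.
        // Concat only starts enumerating its second argument after the first
        // is exhausted, so the buffer is complete by the time Distinct() reads it.
        var result = source
            .WhereTee(x => !mayContainDuplicates(x), candidates)
            .Concat(candidates.Distinct());

        Console.WriteLine(string.Join(", ", result));
    }
}
```

Note that this does not preserve the original ordering: the buffered elements come out after the streamed ones. The result also has to be enumerated only once, because a second enumeration would add the same elements to the buffer again.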
Can you give a little more detail on how you would like to recombine the elements?
One way I can think of to solve this problem is to use the Zip operator from .NET 4.0, like this:
var initialList = new List<int>();
var rejectedElements = initialList.Where(x => !aPredicate(x));
var acceptedElements = initialList.Where(x => aPredicate(x));
var result = acceptedElements.Zip(rejectedElements, (accepted, rejected) => new { accepted, rejected });
This will create a sequence of pairs of accepted and rejected elements. However, its length will be limited by the shorter of the two inputs.
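If what you want is a single stream rather than pairs, Concat may be a better fit than Zip, at the cost of traversing the source twice. The following is only a sketch under assumed inputs: the sample values and the `aPredicate` definition are invented here, and Distinct() is applied to the accepted side as the question describes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipVersusConcat
{
    public static void Main()
    {
        // Hypothetical data and predicate, for illustration only.
        var initialList = new List<int> { 1, 2, 2, 10, 3, 11, 12, 13 };
        Func<int, bool> aPredicate = x => x < 10;

        var acceptedElements = initialList.Where(x => aPredicate(x)).Distinct();
        var rejectedElements = initialList.Where(x => !aPredicate(x));

        // Zip pairs elements positionally and stops when either input runs out,
        // so elements of the longer input without a partner are dropped.
        var pairs = acceptedElements.Zip(rejectedElements,
            (accepted, rejected) => new { accepted, rejected });

        // Concat yields every accepted element, then every rejected element.
        var combined = acceptedElements.Concat(rejectedElements);

        Console.WriteLine(pairs.Count());
        Console.WriteLine(string.Join(", ", combined));
    }
}
```

Both queries are lazy, so nothing is copied into a list, but each Where enumerates the source separately when its part of the result is consumed.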