Splitting a deferred IEnumerable<T> into two sequences without re-evaluation?
I have a method that needs to process an incoming sequence of commands and split the results into different buckets depending on some properties of the result. For example:
class Pets
{
    public IEnumerable<Cat> Cats { get; set; }
    public IEnumerable<Dog> Dogs { get; set; }
}

Pets GetPets(IEnumerable<PetRequest> requests) { ... }
The underlying model is perfectly capable of handling the entire sequence of PetRequest elements at once, and the PetRequest is mostly generic information like an ID, so it makes no sense to try to split the requests at the input. But the provider doesn't actually give back Cat and Dog instances, just a generic data structure:
class PetProvider
{
    IEnumerable<PetData> GetPets(IEnumerable<PetRequest> requests)
    {
        return HandleAllRequests(requests);
    }
}
I've named the response type PetData instead of Pet to clearly indicate that it is not a superclass of Cat or Dog - in other words, conversion to Cat or Dog is a mapping process. The other thing to keep in mind is that HandleAllRequests is expensive, e.g. a database query, so I really don't want to repeat it, and I would prefer to avoid caching the results in memory using ToArray() or the like, because there might be thousands or millions of results (I have a lot of pets).
So far I've been able to throw together this clumsy hack:
Pets GetPets(IEnumerable<PetRequest> requests)
{
    var data = petProvider.GetPets(requests);
    var dataGroups =
        from d in data
        group d by d.Sound into g
        select new { Sound = g.Key, PetData = g };
    IEnumerable<Cat> cats = null;
    IEnumerable<Dog> dogs = null;
    foreach (var g in dataGroups)
        if (g.Sound == "Bark")
            dogs = g.PetData.Select(d => ConvertDog(d));
        else if (g.Sound == "Meow")
            cats = g.PetData.Select(d => ConvertCat(d));
    return new Pets { Cats = cats, Dogs = dogs };
}
This technically works, in the sense that it doesn't cause the PetData results to be enumerated twice, but it has two major problems:
It looks like a giant pimple on the code; it smacks of the awful imperative style we always used to have to employ in the pre-LINQ framework 2.0.
It ends up being a thoroughly pointless exercise, because the GroupBy method is just caching all those results in memory, which means I'm really no better off than if I'd just been lazy and done a ToList() in the first place and attached a few predicates.
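The buffering behaviour of GroupBy described above is easy to observe. The following is only a minimal sketch against LINQ to Objects; GroupByDemo and ItemsPulledBeforeFirstGroup are hypothetical names made up for the illustration, and integers stand in for PetData:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // Counts how many source items have been pulled by the time
    // GroupBy hands out its first group.
    public static int ItemsPulledBeforeFirstGroup(int total)
    {
        int pulled = 0;

        // Deferred source that records every item it produces.
        IEnumerable<int> Source()
        {
            for (int i = 0; i < total; i++)
            {
                pulled++;
                yield return i;
            }
        }

        foreach (var group in Source().GroupBy(i => i % 2))
        {
            // We stop at the very first group, yet the whole
            // source has already been read and held in memory.
            break;
        }
        return pulled;
    }
}
```

With LINQ to Objects, ItemsPulledBeforeFirstGroup(1000) returns 1000: the call to GroupBy itself is deferred, but the first MoveNext reads the entire source before any group is available.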
So to restate the question:
Is it possible to split a single deferred IEnumerable<T> instance into two IEnumerable<?> instances, without performing any eager evaluations, caching results in memory, or having to re-evaluate the original IEnumerable<T> a second time?
Basically, this would be the reverse of a Concat operation. The fact that there isn't already one in the .NET framework is a strong indication that this may not even be possible, but I thought it wouldn't hurt to ask anyway.
P.S. Please don't tell me to create a Pet superclass and just return an IEnumerable<Pet>. I used Cat and Dog as fun examples, but in reality the result types are more like Item and Error - they are both derived from the same generic data but otherwise have nothing in common at all.
Fundamentally, no. Imagine if it were possible. Then consider what happens if I do:
foreach (Cat cat in pets.Cats)
{
    ...
}
foreach (Dog dog in pets.Dogs)
{
    ...
}
That needs to handle all the cats first, and then all the dogs... so what could happen with the original sequence if the first element is a Dog? It either has to cache it or skip it - it can't return it, because we're still asking for Cats.
You could implement something which only caches as much as it needs to, but that's likely to be the whole of one sequence, as typical usage is to completely evaluate one sequence or the other.
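To make the "caches only as much as it needs to" idea concrete, here is a rough sketch. Split is a hypothetical extension method, not something LINQ provides, and the predicate parameter is an assumption about how items would be told apart. It is not thread-safe, and each returned sequence is meant to be enumerated once.

```csharp
using System;
using System.Collections.Generic;

public static class SplitExtensions
{
    // Hypothetical helper: splits one sequence in two while enumerating
    // the source only once. An item is buffered only when the side
    // currently being read pulls an item that belongs to the other side.
    public static (IEnumerable<T> Matching, IEnumerable<T> Rest) Split<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        // For iterator-based sources this does not start execution yet.
        var enumerator = source.GetEnumerator();
        var matchingBuffer = new Queue<T>();
        var restBuffer = new Queue<T>();
        bool finished = false;

        IEnumerable<T> Side(bool wantMatching)
        {
            var mine = wantMatching ? matchingBuffer : restBuffer;
            var theirs = wantMatching ? restBuffer : matchingBuffer;
            while (true)
            {
                // Serve anything the other side already set aside for us.
                if (mine.Count > 0)
                {
                    yield return mine.Dequeue();
                    continue;
                }
                if (finished) yield break;
                if (!enumerator.MoveNext())
                {
                    // The enumerator is only disposed if one side
                    // actually reaches the end of the source.
                    finished = true;
                    enumerator.Dispose();
                    yield break;
                }
                var item = enumerator.Current;
                if (predicate(item) == wantMatching)
                    yield return item;
                else
                    theirs.Enqueue(item); // the unavoidable cache
            }
        }

        return (Side(true), Side(false));
    }
}
```

If the two sides are consumed in an interleaved fashion, the queues stay small. If one side is read to the end first, as in the two foreach loops above, every item of the other side ends up in its queue, which is exactly the caching the question wants to avoid.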
If at all possible, you really just want to handle pets (whether cats or dogs) as you fetch them. Would it be feasible to provide an Action<Cat> and an Action<Dog> and execute the right handler for each item?
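A minimal sketch of that handler-based approach might look like the following. PetDispatcher and Dispatch are hypothetical names, the Name property is invented for the illustration, and the inline object initializers stand in for the question's ConvertCat and ConvertDog:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the types in the question.
public class PetData
{
    public string Sound { get; set; }
    public string Name { get; set; }
}
public class Cat { public string Name { get; set; } }
public class Dog { public string Name { get; set; } }

public static class PetDispatcher
{
    // Enumerates the source exactly once and pushes each converted item
    // to the matching handler, so nothing has to be buffered.
    public static void Dispatch(
        IEnumerable<PetData> data,
        Action<Cat> onCat,
        Action<Dog> onDog)
    {
        foreach (var d in data)
        {
            if (d.Sound == "Meow")
                onCat(new Cat { Name = d.Name });   // stands in for ConvertCat(d)
            else if (d.Sound == "Bark")
                onDog(new Dog { Name = d.Name });   // stands in for ConvertDog(d)
            // Any other sound is ignored, as in the question's code.
        }
    }
}
```

The caller would pass something like petProvider.GetPets(requests) as data along with two lambdas. The trade-off is that control is inverted: the consumer no longer pulls Cats and Dogs at its own pace but reacts to each item as it arrives.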
What Jon said (I'm sure I'm the 1 millionth person to say that).
I'd probably just go old-school and do:
List<Cat> cats = new List<Cat>();
List<Dog> dogs = new List<Dog>();
// Single pass over the source; both lists are filled eagerly.
foreach (var pet in data)
{
    if (pet.Sound == "Bark")
        dogs.Add(ConvertDog(pet));
    else if (pet.Sound == "Meow")
        cats.Add(ConvertCat(pet));
}
But I realise this is not exactly what you want to do - but then you did say re-evaluation - and this does only evaluate once :)