Splitting a deferred IEnumerable<T> into two sequences without re-evaluation?

2023-03-13 09:27

I have a method that needs to process an incoming sequence of commands and split the results into different buckets depending on some properties of the result. For example:

class Pets
{
    public IEnumerable<Cat> Cats { get; set; }
    public IEnumerable<Dog> Dogs { get; set; }
}

Pets GetPets(IEnumerable<PetRequest> requests) { ... }

The underlying model is perfectly capable of handling the entire sequence of PetRequest elements at once, and also the PetRequest is mostly generic information like an ID, so it makes no sense to try to split the requests at the input. But the provider doesn't actually give back Cat and Dog instances, just a generic data structure:

class PetProvider
{
    IEnumerable<PetData> GetPets(IEnumerable<PetRequest> requests)
    {
        return HandleAllRequests(requests);
    }
}

I've named the response type PetData instead of Pet to clearly indicate that it is not a superclass of Cat or Dog - in other words, conversion to Cat or Dog is a mapping process. The other thing to keep in mind is that HandleAllRequests is expensive, e.g. a database query, so I really don't want to repeat it, and I would prefer to avoid caching the results in memory using ToArray() or the like, because there might be thousands or millions of results (I have a lot of pets).

So far I've been able to throw together this clumsy hack:

Pets GetPets(IEnumerable<PetRequest> requests)
{
    var data = petProvider.GetPets(requests);
    var dataGroups = 
        from d in data
        group d by d.Sound into g
        select new { Sound = g.Key, PetData = g };
    IEnumerable<Cat> cats = null;
    IEnumerable<Dog> dogs = null;
    foreach (var g in dataGroups)
        if (g.Sound == "Bark")
            dogs = g.PetData.Select(d => ConvertDog(d));
        else if (g.Sound == "Meow")
            cats = g.PetData.Select(d => ConvertCat(d));
    return new Pets { Cats = cats, Dogs = dogs };
}

This technically works, in the sense that it doesn't cause the PetData results to be enumerated twice, but it has two major problems:

It looks like a giant pimple on the code; it smacks of the awful imperative style we always used to have to employ in the pre-LINQ framework 2.0.
It ends up being a thoroughly pointless exercise, because the GroupBy method is just caching all those results in memory, which means I'm really no better off than if I'd just been lazy and done a ToList() in the first place and attached a few predicates.

So to restate the question:

Is it possible to split a single deferred IEnumerable<T> instance into two IEnumerable<?> instances, without performing any eager evaluations, caching results in memory, or having to re-evaluate the original IEnumerable<T> a second time?

Basically, this would be the reverse of a Concat operation. The fact that there isn't already one in the .NET framework is a strong indication that this may not even be possible, but I thought it wouldn't hurt to ask anyway.

P.S. Please don't tell me to create a Pet superclass and just return an IEnumerable<Pet>. I used Cat and Dog as fun examples, but in reality the result types are more like Item and Error - they are both derived from the same generic data but otherwise have nothing in common at all.

Fundamentally, no. Imagine if it were possible. Then consider what happens if I do:

foreach (Cat cat in pets.Cats)
{
    ...
}

foreach (Dog dog in pets.Dogs)
{
    ...
}

That needs to handle all the cats first, and then all the dogs... so what could happen with the original sequence if the first element is a Dog? It either has to cache it or skip it - it can't return it, because we're still asking for Cats.

You could implement something which only caches as much as it needs to, but that's likely to be the whole of one sequence, as typical usage is to completely evaluate one sequence or the other.
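As a rough illustration of "caches only as much as it needs to", here is a minimal sketch. LazySplitter<T> and its members are hypothetical names, not anything in the framework. It pulls from the shared source on demand and parks items belonging to the other side in a queue, so the source is enumerated once, but one side can still end up fully buffered if the other is drained first. It assumes single-threaded use and that each side is enumerated at most once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the BCL): splits one deferred source into
// two lazy sequences while enumerating the source at most once.
public sealed class LazySplitter<T>
{
    private readonly IEnumerable<T> items;
    private readonly Func<T, bool> isLeft;
    private readonly Queue<T> left = new Queue<T>();
    private readonly Queue<T> right = new Queue<T>();
    private IEnumerator<T> source;   // created on the first pull, not before
    private bool done;

    public LazySplitter(IEnumerable<T> items, Func<T, bool> isLeft)
    {
        this.items = items;
        this.isLeft = isLeft;
    }

    // Each side is single-use: items are discarded once they are yielded.
    public IEnumerable<T> Left => Drain(left, right, true);
    public IEnumerable<T> Right => Drain(right, left, false);

    private IEnumerable<T> Drain(Queue<T> mine, Queue<T> other, bool wantLeft)
    {
        while (true)
        {
            // Serve anything the other side already parked for us.
            if (mine.Count > 0) { yield return mine.Dequeue(); continue; }
            if (done) yield break;
            if (source == null) source = items.GetEnumerator();
            if (!source.MoveNext())
            {
                done = true;
                source.Dispose();
                yield break;
            }
            T item = source.Current;
            if (isLeft(item) == wantLeft) yield return item;
            else other.Enqueue(item);   // parked until the other side asks
        }
    }
}

public static class SplitDemo
{
    public static void Run()
    {
        var splitter = new LazySplitter<int>(new[] { 1, 2, 3, 4, 5 }, n => n % 2 == 0);
        foreach (int even in splitter.Left) Console.WriteLine(even);  // 2, 4
        foreach (int odd in splitter.Right) Console.WriteLine(odd);   // 1, 3, 5 (all from the buffer)
    }
}
```

In the demo every odd number is sitting in memory by the time the first loop finishes, which is exactly the worst case described above. Only interleaved consumption of the two sides keeps the buffers small.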

If at all possible, you really just want to handle pets (whether cats or dogs) as you fetch them. Would it be feasible to provide an Action<Cat> and an Action<Dog> and execute the right handler for each item?
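A minimal sketch of that push-style approach, with one callback per result type, might look like the following. PetDispatcher and Dispatch are hypothetical names, and the Name property and the bodies of ConvertCat and ConvertDog are assumptions, since the question does not show them.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins for the question's types; the Name member is an assumption.
public class PetData { public string Sound { get; set; } public string Name { get; set; } }
public class Cat { public string Name { get; set; } }
public class Dog { public string Name { get; set; } }

public static class PetDispatcher
{
    // Enumerates the source exactly once and hands each converted item to the
    // matching callback as soon as it is produced; nothing is buffered here.
    public static void Dispatch(IEnumerable<PetData> data, Action<Cat> onCat, Action<Dog> onDog)
    {
        foreach (PetData d in data)
        {
            if (d.Sound == "Meow") onCat(ConvertCat(d));
            else if (d.Sound == "Bark") onDog(ConvertDog(d));
            // Any other sound is ignored, as in the question's code.
        }
    }

    // Hypothetical mappings standing in for the question's conversion methods.
    private static Cat ConvertCat(PetData d) => new Cat { Name = d.Name };
    private static Dog ConvertDog(PetData d) => new Dog { Name = d.Name };
}
```

A caller would then write something like PetDispatcher.Dispatch(petProvider.GetPets(requests), cat => Feed(cat), dog => Walk(dog)), where Feed and Walk are placeholders for whatever processing is needed. The trade-off is that the caller no longer gets two sequences to compose with LINQ; the processing has to be expressed as callbacks.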

What Jon said (I'm sure I'm the 1 millionth person to say that).

I'd probably just go old-school and do:

List<Cat> cats = new List<Cat>();
List<Dog> dogs = new List<Dog>();

// A single pass over data, so the expensive source is enumerated only once.
foreach (var pet in data)
{
    if (pet.Sound == "Bark")
        dogs.Add(ConvertDog(pet));
    else if (pet.Sound == "Meow")
        cats.Add(ConvertCat(pet));
}

I realise this is not exactly what you want, since it keeps every result in memory, but you did say you wanted to avoid re-evaluation, and this only enumerates the source once :)
