Zip N IEnumerable<T>s together? Iterate over them simultaneously?

2023-01-21 22:46 Q&A author:

I have:-

IEnumerable<IEnumerable<T>> items;

and I'd like to create:-

IEnumerable<IEnumerable<T>> results;

where the first item in "results" is an IEnumerable of the first item of each of the IEnumerables of "items", the second item in "results" is an IEnumerable of the second item of each of "items", etc.

The IEnumerables aren't necessarily the same lengths. If some of the IEnumerables in items don't have an element at a particular index, then I'd expect the matching IEnumerable in results to have fewer items in it.

For example:-

items = { "1", "2", "3", "4" } , { "a", "b", "c" };
results = { "1", "a" } , { "2", "b" }, { "3", "c" }, { "4" };

Edit: Another example (requested in comments):-

items = { "1", "2", "3", "4" } , { "a", "b", "c" }, { "p", "q", "r", "s", "t" };
results = { "1", "a", "p" } , { "2", "b", "q" }, { "3", "c", "r" }, { "4", "s" }, { "t" };

I don't know in advance how many sequences there are, nor how many elements are in each sequence. I might have 1,000 sequences with 1,000,000 elements in each, and I might only need the first ~10, so I'd like to use the (lazy) enumeration of the source sequences if I can. In particular I don't want to create a new data structure if I can help it.

Is there a built-in method (similar to Enumerable.Zip) that can do this?

Is there another way?

Now lightly tested and with working disposal.

using System;
using System.Collections.Generic;
using System.Linq;

public static class Extensions
{
  public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
    this IEnumerable<IEnumerable<T>> source)
  {
    List<IEnumerator<T>> originalEnumerators = source
      .Select(x => x.GetEnumerator())
      .ToList();

    try
    {
      List<IEnumerator<T>> enumerators = originalEnumerators
        .Where(x => x.MoveNext()).ToList();

      while (enumerators.Any())
      {
        List<T> result = enumerators.Select(x => x.Current).ToList();
        yield return result;
        enumerators = enumerators.Where(x => x.MoveNext()).ToList();
      }
    }
    finally
    {
      originalEnumerators.ForEach(x => x.Dispose());
    }
  } 
}

public class TestExtensions
{
  public void Test1()
  {
    IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
    {
      Enumerable.Range(1, 20).ToList(),
      Enumerable.Range(21, 5).ToList(),
      Enumerable.Range(26, 15).ToList()
    };

    foreach(IEnumerable<int> x in myInts.JaggedPivot().Take(10))
    {
      foreach(int i in x)
      {
        Console.Write("{0} ", i);
      }
      Console.WriteLine();
    }
  }
}

It's reasonably straightforward to do if you can guarantee how the results are going to be used. However, if the results might be used in an arbitrary order, you may need to buffer everything. Consider this:

var results = MethodToBeImplemented(sequences);
var iterator = results.GetEnumerator();
iterator.MoveNext();
var first = iterator.Current;
iterator.MoveNext();
var second = iterator.Current;
foreach (var x in second)
{
    // Do something
}
foreach (var x in first)
{
    // Do something
}

In order to get at the items in "second" you'll have to iterate over all of the subsequences, past the first items. If you then want it to be valid to iterate over the items in first you either need to remember the items or be prepared to re-evaluate the subsequences.

Likewise you'll either need to buffer the subsequences as IEnumerable<T> values or reread the whole lot each time.

Basically it's a whole can of worms which is difficult to do elegantly in a way which will work pleasantly for all situations :( If you have a specific situation in mind with appropriate constraints, we may be able to help more.
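As a rough illustration of the "remember the items" option, here is a minimal sketch that buffers exactly one row at a time, so a row stays valid after the outer iterator has moved on. The names OutOfOrderDemo, PivotRows and Run are hypothetical and exist only for this example; the cost is one small list per row, which may or may not be acceptable for your data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OutOfOrderDemo
{
    // Hypothetical helper: yields one fully buffered row per index.
    public static IEnumerable<IReadOnlyList<T>> PivotRows<T>(
        IEnumerable<IEnumerable<T>> source)
    {
        var all = source.Select(s => s.GetEnumerator()).ToList();
        try
        {
            // Keep only the enumerators that still have an element.
            var live = all.Where(e => e.MoveNext()).ToList();
            while (live.Count > 0)
            {
                // Snapshot the current row before any enumerator advances.
                yield return live.Select(e => e.Current).ToList();
                live = live.Where(e => e.MoveNext()).ToList();
            }
        }
        finally
        {
            all.ForEach(e => e.Dispose());
        }
    }

    // Reads the second row before the first, as in the scenario above.
    public static string Run()
    {
        var sequences = new List<IEnumerable<string>>
        {
            new[] { "1", "2", "3" },
            new[] { "a", "b" }
        };
        using (var iterator = PivotRows<string>(sequences).GetEnumerator())
        {
            iterator.MoveNext();
            var first = iterator.Current;
            iterator.MoveNext();
            var second = iterator.Current;
            return string.Join(",", second) + " | " + string.Join(",", first);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "2,b | 1,a"
    }
}
```

Because each row is a snapshot, the order in which the caller reads the rows no longer matters. What this sketch does not give you is the ability to enumerate the outer result twice without re-reading the sources.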

Based on David B's answer, this code should perform better:

public static IEnumerable<IEnumerable<T>> JaggedPivot<T>(
    this IEnumerable<IEnumerable<T>> source)
{
    var originalEnumerators = source.Select(x => x.GetEnumerator()).ToList();
    try
    {
        var enumerators =
            new List<IEnumerator<T>>(originalEnumerators.Where(x => x.MoveNext()));

        while (enumerators.Any())
        {
            yield return enumerators.Select(x => x.Current).ToList();
            enumerators.RemoveAll(x => !x.MoveNext());
        }
    }
    finally
    {
        originalEnumerators.ForEach(x => x.Dispose());
    }
}

The difference is that the enumerators variable isn't re-created all the time.

Here's one that is a bit shorter, but no doubt less efficient:

Enumerable.Range(0,items.Select(x => x.Count()).Max())
    .Select(x => items.SelectMany(y => y.Skip(x).Take(1)));
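To make the efficiency caveat concrete, here is the same expression wrapped in a method and run against the first example from the question. The names ShortPivotDemo, ShortPivot and Run are hypothetical; the comments describe where the extra work is likely to come from when the sources are lazy sequences rather than in-memory collections.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShortPivotDemo
{
    // Hypothetical wrapper around the one-liner above.
    public static IEnumerable<IEnumerable<T>> ShortPivot<T>(
        IEnumerable<IEnumerable<T>> items)
    {
        // For a lazy source, Count() has to walk the sequence to its end,
        // so every source is fully read before the first row is produced.
        // Max() throws InvalidOperationException if items is empty.
        int longest = items.Select(x => x.Count()).Max();

        // Skip(row) restarts each source and steps past `row` elements,
        // so building every row repeats work already done for earlier rows.
        return Enumerable.Range(0, longest)
            .Select(row => items.SelectMany(y => y.Skip(row).Take(1)));
    }

    public static string Run()
    {
        var items = new List<IEnumerable<string>>
        {
            new[] { "1", "2", "3", "4" },
            new[] { "a", "b", "c" }
        };
        var rows = ShortPivot<string>(items);
        return string.Join(" ", rows.Select(r => "{" + string.Join(",", r) + "}"));
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "{1,a} {2,b} {3,c} {4}"
    }
}
```

The up-front Count() is the part that conflicts most with the "1,000,000 elements, only need the first ~10" requirement in the question.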

What about this?

        List<string[]> items = new List<string[]>()
        {
            new string[] { "a", "b", "c" },
            new string[] { "1", "2", "3" },
            new string[] { "x", "y" },
            new string[] { "y", "z", "w" }
        };

        var x = from i in Enumerable.Range(0, items.Max(a => a.Length))
                select from z in items
                       where z.Length > i
                       select z[i];

You could compose existing operators like this:

IEnumerable<IEnumerable<int>> myInts = new List<IEnumerable<int>>()
    {
        Enumerable.Range(1, 20).ToList(),
        Enumerable.Range(21, 5).ToList(),
        Enumerable.Range(26, 15).ToList()
    };

myInts.SelectMany(item => item.Select((number, index) => Tuple.Create(index, number)))
      .GroupBy(item => item.Item1)
      .Select(group => group.Select(tuple => tuple.Item2));
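A small sketch of that composition applied to the second example from the question may help. The names GroupByPivotDemo, GroupByPivot and Run are hypothetical. One thing to keep in mind: GroupBy is deferred, but it reads its entire input as soon as the first group is requested, so this version is not lazy in the sense the question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByPivotDemo
{
    // Hypothetical wrapper around the composed operators above.
    public static IEnumerable<IEnumerable<T>> GroupByPivot<T>(
        IEnumerable<IEnumerable<T>> items)
    {
        return items
            // Tag every element with its position inside its own sequence.
            .SelectMany(item => item.Select((value, index) => Tuple.Create(index, value)))
            // Collect all elements that share a position; this buffers everything.
            .GroupBy(pair => pair.Item1)
            // Drop the position again, leaving just the values of each row.
            .Select(group => group.Select(pair => pair.Item2));
    }

    public static string Run()
    {
        var items = new List<IEnumerable<string>>
        {
            new[] { "1", "2", "3", "4" },
            new[] { "a", "b", "c" },
            new[] { "p", "q", "r", "s", "t" }
        };
        var rows = GroupByPivot<string>(items);
        return string.Join(" ", rows.Select(r => "{" + string.Join(",", r) + "}"));
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "{1,a,p} {2,b,q} {3,c,r} {4,s} {t}"
    }
}
```

The rows come out in index order because GroupBy yields groups in the order their keys first appear, and a new index only appears after all smaller indexes have been seen.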
