Group by element in LINQ
Let's assume we have the following array:
var arr = new string[] {"foo","bar","jar","\r","a","b","c","\r","x","y","z","\r"};
Also, ignore the fact that these are strings, so no string-hack solutions please.
I want to group these elements by each "\r" in the sequence. That is, I want one array/enumerable with "foo","bar","jar" and another with "a","b","c" etc.
Is there anything in the IEnumerable extensions that will let me do this, or will I have to roll my own group-by method here?
I wrote an extension method for this purpose which works on any IEnumerable<T>:
/// <summary>
/// Splits the specified IEnumerable at every element that satisfies a
/// specified predicate and returns a collection containing each sequence
/// of elements in between each pair of such elements. The elements
/// satisfying the predicate are not included.
/// </summary>
/// <param name="splitWhat">The collection to be split.</param>
/// <param name="splitWhere">A predicate that determines which elements
/// constitute the separators.</param>
/// <returns>A collection containing the individual pieces taken from the
/// original collection.</returns>
public static IEnumerable<IEnumerable<T>> Split<T>(
this IEnumerable<T> splitWhat, Func<T, bool> splitWhere)
{
if (splitWhat == null)
throw new ArgumentNullException("splitWhat");
if (splitWhere == null)
throw new ArgumentNullException("splitWhere");
return splitIterator(splitWhat, splitWhere);
}
private static IEnumerable<IEnumerable<T>> splitIterator<T>(
IEnumerable<T> splitWhat, Func<T, bool> splitWhere)
{
int prevIndex = 0;
foreach (var index in splitWhat
.Select((elem, ind) => new { e = elem, i = ind })
.Where(x => splitWhere(x.e)))
{
yield return splitWhat.Skip(prevIndex).Take(index.i - prevIndex);
prevIndex = index.i + 1;
}
yield return splitWhat.Skip(prevIndex);
}
For example, in your case, you can use it like this:
var arr = new string[] { "foo", "bar", "jar", "\r", "a", "b", "c", "\r", "x", "y", "z", "\r" };
var results = arr.Split(elem => elem == "\r");
foreach (var result in results)
Console.WriteLine(string.Join(", ", result));
This will print:
foo, bar, jar
a, b, c
x, y, z
(including a blank line at the end, because there is a "\r"
at the end of your collection).
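If the trailing empty group is unwanted, one option is to filter the result with Where. The following is only a sketch: it repeats the Split method from this answer in condensed form (without the argument checks) so that the snippet compiles by itself, and the names SplitExtensions, SplitExample and NonEmptyLines are placeholders that do not appear in the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitExtensions
{
    // Condensed copy of the Split method shown above (argument checks omitted).
    public static IEnumerable<IEnumerable<T>> Split<T>(
        this IEnumerable<T> splitWhat, Func<T, bool> splitWhere)
    {
        int prevIndex = 0;
        foreach (var index in splitWhat
            .Select((elem, ind) => new { e = elem, i = ind })
            .Where(x => splitWhere(x.e)))
        {
            yield return splitWhat.Skip(prevIndex).Take(index.i - prevIndex);
            prevIndex = index.i + 1;
        }
        yield return splitWhat.Skip(prevIndex);
    }
}

public static class SplitExample
{
    // Hypothetical helper: returns one joined line per non-empty group.
    public static List<string> NonEmptyLines(IEnumerable<string> items)
    {
        return items
            .Split(elem => elem == "\r")
            .Where(piece => piece.Any())               // drop empty groups
            .Select(piece => string.Join(", ", piece))
            .ToList();
    }
}
```

Note that this filter also drops empty groups in the middle of the sequence (two separators in a row), not just the one at the end, so it only fits if empty groups are never meaningful.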
If you want to use a standard IEnumerable extension method, you'd have to use Aggregate (but this is not as reusable as Timwi's solution):
var list = new[] { "foo","bar","jar","\r","a","b","c","\r","x","y","z","\r" };
var res = list.Aggregate(new List<List<string>>(),
(l, s) =>
{
if (s == "\r")
{
l.Add(new List<string>());
}
else
{
if (!l.Any())
{
l.Add(new List<string>());
}
l.Last().Add(s);
}
return l;
});
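To make the Aggregate approach somewhat more reusable, the same accumulator logic could be wrapped in a generic extension method. This is a sketch under assumptions: the class name AggregateSplitExtensions, the method name SplitWithAggregate and the parameter isSeparator are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSplitExtensions
{
    // Hypothetical helper: the Aggregate approach above, made generic over T.
    public static List<List<T>> SplitWithAggregate<T>(
        this IEnumerable<T> source, Func<T, bool> isSeparator)
    {
        return source.Aggregate(new List<List<T>>(), (groups, item) =>
        {
            if (isSeparator(item))
            {
                // A separator opens a new (initially empty) group.
                groups.Add(new List<T>());
            }
            else
            {
                // The very first element needs a group to go into.
                if (groups.Count == 0)
                    groups.Add(new List<T>());
                groups[groups.Count - 1].Add(item);
            }
            return groups;
        });
    }
}
```

For the list above, list.SplitWithAggregate(s => s == "\r") returns four lists. The last one is empty because the input ends with a separator, just as in the non-generic version.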
See also the related question on nesting yields to return IEnumerable<IEnumerable<T>> with lazy evaluation. You can have a SplitBy extension method that accepts a predicate to split on:
public static IEnumerable<IList<T>> SplitBy<T>(this IEnumerable<T> source,
Func<T, bool> separatorPredicate,
bool includeEmptyEntries = false,
bool includeSeparators = false)
{
var l = new List<T>();
foreach (var x in source)
{
if (!separatorPredicate(x))
l.Add(x);
else
{
if (includeEmptyEntries || l.Count != 0)
{
if (includeSeparators)
l.Add(x);
yield return l;
}
l = new List<T>();
}
}
if (l.Count != 0)
yield return l;
}
So in your case:
var arr = new string[] {"foo","bar","jar","\r","a","b","c","\r","x","y","z","\r"};
foreach (var items in arr.SplitBy(x => x == "\r"))
foreach (var item in items)
{
}
This is the same idea as Timwi's answer, implemented differently. There is no error checking; that's up to you. It should be faster, since the source is traversed only once.
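If you do want the error checking, one way is to follow the pattern from Timwi's answer: a public method that validates its arguments immediately and a private iterator that does the work. Because an iterator method does not run until it is enumerated, putting the null checks directly inside it would delay the exceptions. The sketch below assumes the names SplitByChecked and SplitByIterator, which are not part of the original answer; the iterator body is the SplitBy code from above, unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitByExtensions
{
    // Hypothetical wrapper: throws right away on null arguments,
    // then hands off to the lazy iterator.
    public static IEnumerable<IList<T>> SplitByChecked<T>(
        this IEnumerable<T> source,
        Func<T, bool> separatorPredicate,
        bool includeEmptyEntries = false,
        bool includeSeparators = false)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        if (separatorPredicate == null)
            throw new ArgumentNullException(nameof(separatorPredicate));
        return SplitByIterator(source, separatorPredicate,
                               includeEmptyEntries, includeSeparators);
    }

    // Same single-pass logic as SplitBy above.
    private static IEnumerable<IList<T>> SplitByIterator<T>(
        IEnumerable<T> source,
        Func<T, bool> separatorPredicate,
        bool includeEmptyEntries,
        bool includeSeparators)
    {
        var l = new List<T>();
        foreach (var x in source)
        {
            if (!separatorPredicate(x))
                l.Add(x);
            else
            {
                if (includeEmptyEntries || l.Count != 0)
                {
                    if (includeSeparators)
                        l.Add(x);
                    yield return l;
                }
                l = new List<T>();
            }
        }
        if (l.Count != 0)
            yield return l;
    }
}
```

With the array from the question, arr.SplitByChecked(x => x == "\r") yields three lists, and passing includeSeparators: true makes each of them end with its "\r" separator.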