How do I count the number of items in each consecutive group of a sequence using LINQ?
For example, I have a sequence of integers
1122211121
I'd like to get some dictionary/anonymous class showing:
item | count
1 | 2
2 | 3
1 | 3
2 | 1
1 | 1
var test = new[] { 1, 2, 2, 2, 2, 1, 1, 3 };
int previous = test.First();
int idx = 0; // run index; incremented each time the value changes
var runs = test.Select(x =>
        x == previous ?
            new { orig = x, helper = idx } :
            new { orig = previous = x, helper = ++idx })
    .GroupBy(x => x.helper)
    .Select(group => new { number = group.First().orig, count = group.Count() });
Initialization of previous and idx could be done in a let clause if you want to be even more LINQ-y.
from whatever in new[] { "i want to use linq everywhere" }
let previous = test.First()
let idx = 0
from x in test
...
Functional programming is nice, but IMHO this is a case where, in C#, I would choose a procedural approach instead.
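For comparison, here is a minimal sketch of what such a procedural approach might look like. The class name RunLengthSketch and the method CountRuns are hypothetical names chosen for illustration, not part of any answer above or of any library; the sketch assumes the default equality comparer is an acceptable way to decide whether two neighbours belong to the same run.

```csharp
using System;
using System.Collections.Generic;

public static class RunLengthSketch
{
    // Hypothetical helper: yields one (item, count) pair per run of
    // equal consecutive elements, walking the source exactly once.
    public static IEnumerable<KeyValuePair<T, int>> CountRuns<T>(IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                yield break; // an empty input produces no runs

            var current = e.Current; // value of the run being counted
            var count = 1;
            while (e.MoveNext())
            {
                if (comparer.Equals(e.Current, current))
                {
                    count++;
                }
                else
                {
                    // The value changed: emit the finished run, start a new one.
                    yield return new KeyValuePair<T, int>(current, count);
                    current = e.Current;
                    count = 1;
                }
            }
            // Emit the final run.
            yield return new KeyValuePair<T, int>(current, count);
        }
    }
}
```

Calling RunLengthSketch.CountRuns(new[] { 1, 1, 2, 2, 2, 1, 1, 1, 2, 1 }) should yield the pairs (1, 2), (2, 3), (1, 3), (2, 1), (1, 1), matching the table in the question.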
You're looking to do something like the "Batch" operator in the morelinq project, then output the count of the groups.
Unfortunately, the batch operator from morelinq just takes a size and returns buckets batched by that size (or it did when I was looking at morelinq). To correct this deficiency, I had to write my own batch implementation.
private static IEnumerable<TResult> BatchImplementation<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TSource, int, bool> breakCondition,
    Func<IEnumerable<TSource>, TResult> resultSelector
)
{
    List<TSource> bucket = null;
    var lastItem = default(TSource);
    var count = 0;
    foreach (var item in source)
    {
        // For the first item, lastItem is default(TSource) and count is 0.
        var isBreak = breakCondition(item, lastItem, count++);
        // Always open a bucket for the first item, even if breakCondition returns false.
        if (bucket == null || isBreak)
        {
            if (bucket != null)
            {
                yield return resultSelector(bucket.Select(x => x));
            }
            bucket = new List<TSource>();
        }
        bucket.Add(item);
        lastItem = item;
    }
    // Return the last bucket with all remaining elements (bucket is still null for an empty source)
    if (bucket != null && bucket.Count > 0)
    {
        yield return resultSelector(bucket.Select(x => x));
    }
}
This is the private implementation; I expose it through various public overloads that validate the input parameters. You would want your breakCondition to be something of the form:
Func<int, int, int, bool> breakCondition = (x, y, z) => x != y;
This should give you, for your example sequence: {1, 1}, {2, 2, 2}, {1, 1, 1}, {2}, {1}
From here, grabbing the first item of each sequence and then counting the sequence is trivial.
Edit: To assist in implementation -
public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, TSource, int, bool> breakCondition
)
{
    // Validate that source and breakCondition are not null
    return BatchImplementation(source, breakCondition, x => x);
}
Your code would then be:
var sequence = new[] { 1, 1, 2, 2, 2, 1, 1, 1, 2, 1 };
var batchedSequence = sequence.Batch((x, y, z) => x != y);
// batchedSequence = {{1, 1}, {2, 2, 2}, {1, 1, 1}, {2}, {1}}
var counts = batchedSequence.Select(x => x.Count());
// counts = {2, 3, 3, 1, 1}
var items = batchedSequence.Select(x => x.First());
// items = {1, 2, 1, 2, 1}
var final = counts.Zip(items, (c, i) => new { Item = i, Count = c });
I haven't compiled and tested any of this except the private method and its overloads that I use in my own codebase, but this should solve your problem and any similar problems you have.
Well... a bit shorter (notice the double Separate call: regex matches cannot overlap, so a single pass can miss a boundary that starts on the last digit of the previous match):
using System;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        string separatedDigits = Separate(Separate("1122211121"));
        foreach (var ano in separatedDigits.Split('|').Select(block => new { item = block.Substring(0, 1), count = block.Length }))
            Console.WriteLine(ano);
        Console.ReadKey();
    }

    static string Separate(string input)
    {
        // Insert '|' between two adjacent digits that differ
        return Regex.Replace(input, @"(\d)(?!\1)(\d)", "$1|$2");
    }
}
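As a possible variation on the regex idea (not the answerer's code), the second digit can be checked with a lookahead instead of being captured. Lookaheads consume no input, so every boundary should be visible in a single pass. The class name LookaheadSeparateSketch is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadSeparateSketch
{
    // Inserts '|' after every digit that is immediately followed by a
    // different digit. Only the first digit is consumed by each match;
    // the following digit is inspected through lookaheads, so it stays
    // available as the start of the next match.
    public static string Separate(string input)
    {
        return Regex.Replace(input, @"(\d)(?!\1)(?=\d)", "$1|");
    }
}
```

With this pattern, LookaheadSeparateSketch.Separate("1122211121") should return "11|222|111|2|1" after one call, and the Split/Select step from the answer above can be applied to the result unchanged.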