LINQ duplicate removal with a twist
I have a list that contains all the status items of each order. My problem is that I need to remove every item whose status/log-date combination is not the highest.
E.g.:

    var inputs = new List<StatusItem>();
    // Note that the 3rd argument is simply a modifier that adds that many seconds
    // to the current DateTime, to make testing easier.
    inputs.Add(new StatusItem(123, 30, 1));
    inputs.Add(new StatusItem(123, 40, 2));
    inputs.Add(new StatusItem(123, 50, 3));
    inputs.Add(new StatusItem(123, 40, 4));
    inputs.Add(new StatusItem(123, 50, 5));
    inputs.Add(new StatusItem(100, 20, 6));
    inputs.Add(new StatusItem(100, 30, 7));
    inputs.Add(new StatusItem(100, 20, 8));
    inputs.Add(new StatusItem(100, 30, 9));
    inputs.Add(new StatusItem(100, 40, 10));
    inputs.Add(new StatusItem(100, 50, 11));
    inputs.Add(new StatusItem(100, 40, 12));

    var l = from i in inputs
            group i by i.internalId
            into cg
            select
                from s in cg
                group s by s.statusId
                into sg
                select sg.OrderByDescending(n => n.date).First();

Edit: for convenience I'm adding the class definition as well.

    public class StatusItem
    {
        public int internalId;
        public int statusId;
        public DateTime date;

        public StatusItem(int internalId, int statusId, int secMod)
        {
            this.internalId = internalId;
            this.statusId = statusId;
            date = DateTime.Now.AddSeconds(secMod);
        }
    }

This produces a list that gives me the following:

    order 123 status 30 date 4/9/2010 6:44:21 PM
    order 123 status 40 date 4/9/2010 6:44:24 PM
    order 123 status 50 date 4/9/2010 6:44:25 PM
    order 100 status 20 date 4/9/2010 6:44:28 PM
    order 100 status 30 date 4/9/2010 6:44:29 PM
    order 100 status 40 date 4/9/2010 6:44:32 PM
    order 100 status 50 date 4/9/2010 6:44:31 PM

This is ALMOST correct. However, the last line, which has status 50, needs to be filtered out as well, because it was overruled by status 40 in the history list. You can tell because its date is earlier than that of the "last" status item with status 40.
I was hoping someone could give me some pointers, because I'm stuck.
Edit: Final complete solution:

    var k = from sg in
                from i in inputs
                group i by i.internalId
                into cg
                select
                    from s in cg
                    group s by s.statusId
                    into sg
                    select sg.OrderByDescending(n => n.date).First()
            from s in sg
            where s.date >= sg.Where(n => n.statusId <= s.statusId).Max(n => n.date)
            group s by s.internalId
            into si
            from x in si
            select x;

Looks like you don't currently have anything performing the filtering you need for the date, so you'd need to do something about that.
Offhand, something like this would perform the additional filtering:

    var k = from sg in l
            from s in sg
            where s.date >= sg.Where(n => n.statusId <= s.statusId).Max(n => n.date)
            group s by s.internalId;

Haven't tested it, so the grouping may not be what you want, and the comparisons may be reversed, but something like that should filter. Using >= and <= instead of > or < should mean that each status is always compared against itself, so the aggregate never has to deal with an empty set.
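For what it's worth, here is a self-contained sketch of those two steps run against the sample data from the question. It makes a few substitutions that are not in the original code: it assumes C# 9 or later (top-level statements), uses value tuples in place of the StatusItem class, and uses a fixed base time instead of DateTime.Now so that runs are repeatable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Fixed base time: an assumption for repeatability (the question uses DateTime.Now).
var start = new DateTime(2010, 4, 9, 18, 44, 20);

// (internalId, statusId, seconds offset): the sample data from the question.
var raw = new (int internalId, int statusId, int secMod)[]
{
    (123, 30, 1), (123, 40, 2), (123, 50, 3), (123, 40, 4), (123, 50, 5),
    (100, 20, 6), (100, 30, 7), (100, 20, 8), (100, 30, 9),
    (100, 40, 10), (100, 50, 11), (100, 40, 12),
};

var inputs = raw
    .Select(r => (internalId: r.internalId, statusId: r.statusId,
                  date: start.AddSeconds(r.secMod)))
    .ToList();

var kept = inputs
    .GroupBy(i => i.internalId)
    .SelectMany(order =>
    {
        // Step 1: keep only the latest entry per status within this order.
        var latest = order
            .GroupBy(s => s.statusId)
            .Select(g => g.OrderByDescending(n => n.date).First())
            .ToList();

        // Step 2: drop an entry when some lower-or-equal status has a later date.
        // Because of <=, the entry itself is always in the filtered set, so Max
        // is never called on an empty sequence (which would throw).
        return latest.Where(s =>
            s.date >= latest.Where(n => n.statusId <= s.statusId).Max(n => n.date));
    })
    .ToList();

foreach (var s in kept)
    Console.WriteLine($"order {s.internalId} status {s.statusId} date {s.date:HH:mm:ss}");
```

With this data, six entries should survive, and the status 50 entry of order 100 is the one that gets dropped, since status 40 has a later date.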
It's not exactly in the same form you have, but it does give the correct result. I made a status item class with i, j, and k properties. Not sure what names you used for them.

    var keys = inputs.Select(
        input =>
            new { i = input.i, j = input.j })
        .Distinct();

    var maxes = keys.Select(
        ints =>
            inputs.First(
                input =>
                    input.i == ints.i
                    && input.j == ints.j
                    && input.k == inputs.Where(
                        i =>
                            i.i == ints.i
                            && i.j == ints.j
                        ).Select(i => i.k).Max()));