Querying for a unique value based on the aggregate of another value while grouping on a third value entirely
So I know this problem isn't a new one, but I'm trying to wrap my head around it and understand the best way to deal with scenarios like this.
Say I have a hypothetical table 'X' that looks like this:
GroupID ID (identity) SomeDateTime
--------------------------------------------
1 1000 1/1/01
1 1001 2/2/02
1 1002 3/3/03
2 1003 4/4/04
2 1004 5/5/05
I want to query it so the result set looks like this:
----------------------------------------
1 1002 3/3/03
2 1004 5/5/05
Basically what I want is the MAX SomeDateTime
value grouped by my GroupID
column. The kicker is that I DON'T want to group by the ID
column,
I just want to know the 'ID' that corresponds to the MAX SomeDateTime
开发者_Python百科.
I know one pseudo-solution would be:
;WITH X1 as (
SELECT MAX(SomeDateTime) as SomeDateTime, GroupID
FROM X
GROUP BY GroupID
)
SELECT X1.SomeDateTime, X1.GroupID, X2.ID
FROM X1
INNER JOIN X as X2
ON X.DateTime = X2.DateTime
But this doesn't solve the fact that a DateTime might not be unique. And it seems sloppy to join on a DateTime like that.
Another pseudo-solution could be:
SELECT X.GroupID, MAX(X.ID) as ID, MAX(X.SomeDateTime) as SomeDateTime
FROM X
GROUP BY X.GroupID
But there are no guarantees that ID
will actually match the row that SomeDateTime
comes from.
A third less useful option might be:
SELECT TOP 1 X.GroupID, X.ID, X.SomeDateTime
FROM X
WHERE X.GroupID = 1
ORDER BY X.SomeDateTime DESC
But obviously that only works with a single, known, GroupID
. I want to be able to join this result set on GroupID
and/or ID
.
Does anyone know of any clever solutions? Any good uses of windowing functions?
Thanks!
I think this will do what you want.
;WITH X1
AS
(
SELECT SomeDateTime
,GroupID
,ID
,ROW_NUMBER() OVER (PARTITION BY GroupID
ORDER BY SomeDateTime DESC
) AS rn
FROM X
)
SELECT SomeDateTime
,GroupID
,ID
FROM X1
WHERE rn = 1
Your first solution is the correct one. I'd have written is as an inner select (note, syntax in innnersleect is ifferent by dbms and I'm used to db2 these days, so that's what you get :-):
Select g.groupid, g.id, g.somedatetime
from x g, (select s.groupid, max(s.somedatetime) from x s group by s.groupid) si
where g.groupid=s.groupid and g.somedatetime=si.somedatetime;
But, as you noted if somedatetime isn't unique for an id, then you 'll get multiple rows for each such group id. to eliminate that, you'll need an additional sub select to get the max(id) for the max(somedatetime). Very messy:
Select g.groupid, g.id, g.somedatetime
from x g, (select s.groupid, max(s.somedatetime) as maxdate from x s group by s.groupid) si,
(select i.groupid, max(i.id) as maxid, i.somedatetime from x i where i.groupid=si.groupid and i.somedatetime=si.maxdate group by i.groupid, i.somedatetime) si2
where g.groupid=s.groupid and g.id=maxid and g.somedatetime=maxdate;
You can also use a combination of the cross apply function and the Top(1) to solve this
select
distinct(GroupId)
,maxvales.SomeDateTime
from
X as outerX
cross apply
(select top(1) SomeDateTime from X as innerX where innerX.GroupID = outerX.GroupID order by SomeDateTime desc) as maxvales
精彩评论