Self join to lowest occurrence of group
I have a problem in T-SQL that I find difficult to solve.
I have a table with groups of records, grouped by key1 and key2. I order each group chronologically by date. For each record, I want to see if there existed a record before (within the group and with lower date) for which the field "datafield" forms an allowed combination with the current record's "datafield". For the allowed combinations, I have a table called AllowedCombinationsTable.
I wrote following code to achieve it:
WITH Source AS (
SELECT key1, key2, datafield, date1,
ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder
FROM table
)
SELECT L.key1, L.key2, L.datafield, DC.datafield2
FROM Source AS L
LEFT JOIN AllowedDataCombinationsTable DC
ON D.datafield1 = L.datafield
LEFT JOIN Source AS R
ON R.Key1 = L.Key1
AND R.Key2 = L.Key2
AND R.dateorder < L.dateorder
AND DC.datafield2 = L.datafield
-- AND "pick the one record with lowest dateorder"
Now for each of these possible combination records, I want to pick the first one (see placeholder in code). How can I do it most efficiently?
EDIT: OK let's say for the source, only showing group (1, 1):
**Key1 Key2 Datafield Date DateOrder**
1 1 "Horse" 1-Jan-2010 1
1 1 "Horse" 2-Jan-2010 2
1 1 "Sheep" 3-Jan-2010 3
1 1 "Dog" 4-Jan-2010 4
1 1 "Cat" 5-Jan-2010 5
AllowedCombinationsTable:
**Datafield1 Datafield**
Cat Sheep (and Sheep Cat)
Cat Horse (and Horse Cat)
Dog Horse (and Horse Dog)
After my join I have now:
**Key1 Key2 Datafield Date DateOrder JoinedCombination JoinedCombinationDateOrder**
1 1 "Horse" 1-Jan-2010 1 NULL NULL
1 1 "Horse" 2-Jan-2010 2 NULL NULL
1 1 "Sheep" 3-Jan-2010 3 NULL NULL
1 1 "Dog" 4-Jan-2010 4 "Horse" 1
1 1 "Dog" 4-Jan-2010 4 "Horse" 2
1 1 "Cat" 5-Jan-2010 5 "Horse" 1
1 1 "Cat" 5-Jan-2010 5 "Horse" 2
1 1 "Cat" 5-Jan-2010 5 "Sheep" 3
I want to display only the first "Horse" for record 4 "Dog", and also only the first开发者_如何转开发 "Horse" for record 5 "Cat".
Get it? ;)
I think this may do it--don't have data set up to test the query with. Check the comments for rationale.
WITH Source AS (
SELECT key1, key2, datafield, date1,
ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder
FROM table
)
SELECT L.key1, L.key2, L.datafield, DC.datafield2
FROM Source AS L
LEFT JOIN AllowedDataCombinationsTable DC
ON DC.datafield1 = L.datafield -- DC Alias
LEFT JOIN Source AS R
ON R.Key1 = L.Key1
AND R.Key2 = L.Key2
AND DC.datafield2 = R.datafield -- Changed alias from L to R
AND R.dateorder = 1 -- Pick out lowest one
AND R.dateorder < L.dateorder -- Make sure it's not the same one
Well, I don't use WITH
or OVER
, so this is a different approach.. I might be over-simplifying something, but without having the data in front of me this is what I came up with:
SELECT distinct a.Key1, a.Key2, a.Datafield,
ISNULL(b.Datafield,'') as Datafield1,
ISNULL(b.Date,a.Date) as `Date`,
MIN(a.DateOrder) as DateOrder
FROM Source a
LEFT JOIN Source b
ON a.Key1 = b.Key1
AND a.Key2 = b.Key2
AND a.Dateorder <> b.Dateorder
LEFT JOIN AllowedDataCombinationsTable c
ON a.Datafield = c.Datafield
AND b.Datafield = c.Datafield1
GROUP BY a.Key1, a.Key2, a.Datafield, ISNULL(b.Datafield,''), ISNULL(b.Date,a.Date)
精彩评论