开发者

Self join to lowest occurrence of group

I have a problem in T-SQL that I find difficult to solve.

I have a table with groups of records, grouped by key1 and key2. I order each group chronologically by date. For each record, I want to see if there existed a record before (within the group and with lower date) for which the field "datafield" forms an allowed combination with the current record's "datafield". For the allowed combinations, I have a table called AllowedCombinationsTable.

I wrote following code to achieve it:

WITH Source AS (
    SELECT key1, key2, datafield, date1,
        ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder
        FROM table
)
SELECT L.key1, L.key2, L.datafield, DC.datafield2
FROM Source AS L
LEFT JOIN AllowedDataCombinationsTable DC
    ON D.datafield1 = L.datafield
LEFT JOIN Source AS R
    ON R.Key1 = L.Key1
    AND R.Key2 = L.Key2
    AND R.dateorder < L.dateorder
    AND DC.datafield2 = L.datafield
    -- AND "pick the one record with lowest dateorder"

Now for each of these possible combination records, I want to pick the first one (see placeholder in code). How can I do it most efficiently?


EDIT: OK let's say for the source, only showing group (1, 1):

**Key1 Key2 Datafield Date DateOrder**
1 1 "Horse" 1-Jan-2010 1
1 1 "Horse" 2-Jan-2010 2
1 1 "Sheep" 3-Jan-2010 3
1 1 "Dog" 4-Jan-2010 4
1 1 "Cat" 5-Jan-2010 5

AllowedCombinationsTable:

**Datafield1 Datafield**
Cat Sheep (and Sheep Cat)
Cat Horse (and Horse Cat)
Dog Horse (and Horse Dog)

After my join I have now:

**Key1 Key2 Datafield Date DateOrder JoinedCombination JoinedCombinationDateOrder**
1 1 "Horse" 1-Jan-2010 1 NULL NULL
1 1 "Horse" 2-Jan-2010 2 NULL NULL
1 1 "Sheep" 3-Jan-2010 3 NULL NULL
1 1 "Dog" 4-Jan-2010 4 "Horse" 1
1 1 "Dog" 4-Jan-2010 4 "Horse" 2
1 1 "Cat" 5-Jan-2010 5 "Horse" 1
1 1 "Cat" 5-Jan-2010 5 "Horse" 2
1 1 "Cat" 5-Jan-2010 5 "Sheep" 3

I want to display only the first "Horse" for record 4 "Dog", and also only the first开发者_如何转开发 "Horse" for record 5 "Cat".

Get it? ;)


I think this may do it--don't have data set up to test the query with. Check the comments for rationale.

WITH Source AS ( 
    SELECT key1, key2, datafield, date1, 
        ROW_NUMBER() OVER(PARTITION BY key1, key2 ORDER BY date1 ASC) AS dateorder 
        FROM table 
) 
SELECT L.key1, L.key2, L.datafield, DC.datafield2 
FROM Source AS L 
LEFT JOIN AllowedDataCombinationsTable DC 
    ON DC.datafield1 = L.datafield   --  DC Alias
LEFT JOIN Source AS R 
    ON R.Key1 = L.Key1 
    AND R.Key2 = L.Key2 
    AND DC.datafield2 = R.datafield   --  Changed alias from L to R
    AND R.dateorder = 1               --  Pick out lowest one
    AND R.dateorder < L.dateorder     --  Make sure it's not the same one


Well, I don't use WITH or OVER, so this is a different approach.. I might be over-simplifying something, but without having the data in front of me this is what I came up with:

SELECT distinct a.Key1, a.Key2, a.Datafield, 
       ISNULL(b.Datafield,'') as Datafield1, 
       ISNULL(b.Date,a.Date) as `Date`, 
       MIN(a.DateOrder) as DateOrder
FROM Source a
LEFT JOIN Source b 
     ON a.Key1 = b.Key1
     AND a.Key2 = b.Key2
     AND a.Dateorder <> b.Dateorder
LEFT JOIN AllowedDataCombinationsTable c
     ON a.Datafield = c.Datafield
     AND b.Datafield = c.Datafield1
GROUP BY a.Key1, a.Key2, a.Datafield, ISNULL(b.Datafield,''), ISNULL(b.Date,a.Date)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜