SQL query to merge 2 tables with additional conditions?

I have two tables with identical structure: user_id, name, age, date_added. The user_id column may contain duplicate IDs.

I need to merge these two tables into one under the following condition: if there are multiple records with the same name for the same user, only the latest record (by date_added) should be kept. The script will run on MSSQL 2005, but I would also appreciate a version that does not use ROW_NUMBER(). I need it only to reload a broken table once, so performance is not critical.

example:

table1:
1,'john',21,01/01/2010
1,'john',15,01/01/2005
1,'john',71,01/01/2001

table2:
1,'john',81,01/01/2007
1,'john',15,01/01/2005
1,'john',11,01/01/2008

result:
1,'john',21,01/01/2010

UPDATE: I think I've found my own solution. It is based on the answers to my previous question given by Larry Lustig and Joe Stefanelli.

with tmp2 as
    (
    SELECT * FROM table1
    UNION
    SELECT * FROM table2
    )
SELECT  *
FROM    tmp2 c1
WHERE   (
        SELECT  COUNT(*)
        FROM    tmp2 c2
        WHERE   c2.user_id = c1.user_id
                AND c2.name = c1.name
                AND c2.date_added >= c1.date_added
        ) <= 1

Could you please help me convert this query to one without a WITH clause?


Here's a variant of @Andomar's answer:

; with all_users as 
    (
    select  *
    from    table1 u1
    union all
    select  *
    from    table2 u2
    )
, ranker as 
    (
    -- rank 1 = latest date_added within each (user_id, name) group
    select  *
    ,       rank() over (partition by user_id, name order by date_added desc) as [r]
    from    all_users
    )
select * from ranker where [r] = 1


Just in the interests of giving a different approach...

WITH distinctlist
     AS (SELECT user_id,
                name
         FROM   table1
         UNION
         SELECT user_id,
                name
         FROM   table2)
SELECT C.*
FROM   distinctlist d
       CROSS APPLY (SELECT TOP 1 *
                    FROM   (-- latest row for this user/name in table1
                            SELECT *
                            FROM   (SELECT TOP 1 *
                                    FROM   table1
                                    WHERE  user_id = d.user_id
                                           AND name = d.name
                                    ORDER  BY date_added DESC) T1
                            UNION ALL
                            -- latest row for this user/name in table2
                            SELECT *
                            FROM   (SELECT TOP 1 *
                                    FROM   table2
                                    WHERE  user_id = d.user_id
                                           AND name = d.name
                                    ORDER  BY date_added DESC) T2) T
                    ORDER  BY date_added DESC) C


You could use not exists, like:

; with all_users as 
        (
        select  *
        from    table1 u1
        union all
        select  *
        from    table2 u2
        )
select  *
from    all_users u1
where   not exists  
        (
        -- a newer row for the same user and name
        select  *
        from    all_users u2
        where   u1.user_id = u2.user_id
                and u1.name = u2.name
                and u1.date_added < u2.date_added
        )

If the database doesn't support CTEs, expand all_users in the two places it is used.
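
As a rough sketch of what that expansion could look like, the snippet below inlines the combined rows as a derived table in both places and runs the query against the sample data from the question. Some assumptions: Python's built-in sqlite3 is used only as a stand-in so the example is runnable (it is not SQL Server), dates are stored as ISO strings so that they compare correctly, and UNION is used instead of UNION ALL so that a row present in both tables appears only once. The SELECT itself uses no WITH clause and no ROW_NUMBER().

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption made only so
# the sketch is runnable); the sample rows are the ones from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user_id INTEGER, name TEXT, age INTEGER, date_added TEXT);
    CREATE TABLE table2 (user_id INTEGER, name TEXT, age INTEGER, date_added TEXT);
    INSERT INTO table1 VALUES (1, 'john', 21, '2010-01-01'),
                              (1, 'john', 15, '2005-01-01'),
                              (1, 'john', 71, '2001-01-01');
    INSERT INTO table2 VALUES (1, 'john', 81, '2007-01-01'),
                              (1, 'john', 15, '2005-01-01'),
                              (1, 'john', 11, '2008-01-01');
""")

# The combined rows are written out inline in both places where the CTE
# would otherwise be referenced.
QUERY_WITHOUT_CTE = """
    SELECT u1.user_id, u1.name, u1.age, u1.date_added
    FROM   (SELECT * FROM table1 UNION SELECT * FROM table2) u1
    WHERE  NOT EXISTS (
               SELECT 1
               FROM   (SELECT * FROM table1 UNION SELECT * FROM table2) u2
               WHERE  u2.user_id = u1.user_id
                      AND u2.name = u1.name
                      AND u2.date_added > u1.date_added
           )
"""

latest_rows = conn.execute(QUERY_WITHOUT_CTE).fetchall()
print(latest_rows)
```

The query text should carry over to MSSQL 2005 largely unchanged, since it relies only on derived tables and NOT EXISTS, but it has not been run there.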

P.S. If you only need user_id, name and the latest date_added (and not age), you could use an even simpler solution:

select  user_id
,       name
,       MAX(date_added) as date_added
from    (
        select  *
        from    table1 u1
        union all
        select  *
        from    table2 u2
        ) sub
group by
        user_id
,       name
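
The grouped query above returns only the grouping columns and the latest date. If the remaining columns (such as age) are needed as well, one possible approach is to join the grouped result back to the combined rows on the latest date. The sketch below is an illustration under assumptions, not a tested MSSQL script: Python's built-in sqlite3 stands in for SQL Server, dates are ISO strings, and the 'mary' rows are made-up data added only to show a second group.

```python
import sqlite3

# In-memory SQLite as a runnable stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (user_id INTEGER, name TEXT, age INTEGER, date_added TEXT);
    CREATE TABLE table2 (user_id INTEGER, name TEXT, age INTEGER, date_added TEXT);
    INSERT INTO table1 VALUES (1, 'john', 21, '2010-01-01'),
                              (1, 'john', 15, '2005-01-01'),
                              (2, 'mary', 30, '2003-01-01');  -- hypothetical row
    INSERT INTO table2 VALUES (1, 'john', 81, '2007-01-01'),
                              (1, 'john', 15, '2005-01-01'),
                              (2, 'mary', 33, '2006-01-01');  -- hypothetical row
""")

# Step 1 (inner query m): latest date_added per (user_id, name).
# Step 2 (join): pick the full rows that carry that latest date.
# DISTINCT collapses a latest row that exists identically in both tables.
JOIN_BACK_QUERY = """
    SELECT DISTINCT a.user_id, a.name, a.age, a.date_added
    FROM   (SELECT * FROM table1 UNION ALL SELECT * FROM table2) a
    JOIN   (SELECT user_id, name, MAX(date_added) AS max_date
            FROM   (SELECT * FROM table1 UNION ALL SELECT * FROM table2) sub
            GROUP  BY user_id, name) m
           ON  m.user_id = a.user_id
           AND m.name = a.name
           AND m.max_date = a.date_added
    ORDER  BY a.user_id, a.name
"""

merged_rows = conn.execute(JOIN_BACK_QUERY).fetchall()
print(merged_rows)
```

If two different rows share the same latest date for a user and name, both are returned, so a tie-breaking rule would still be needed in that case.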