SQL query to merge 2 tables with additional conditions?
I have 2 tables with identical structure: user_id, name, age, date_added. The user_id column may contain duplicate IDs.
I need to merge those 2 tables into 1 with the following condition: if there are multiple records with an identical 'name' for the same user, keep only the LATEST (by date_added) record. This script will be used with MSSQL 2005, but I would also appreciate a version that does not use ROW_NUMBER(). I need this script to reload a broken table once, so performance is not critical.
example:
table1:
1,'john',21,01/01/2010
1,'john',15,01/01/2005
1,'john',71,01/01/2001
table2:
1,'john',81,01/01/2007
1,'john',15,01/01/2005
1,'john',11,01/01/2008
result:
1,'john',21,01/01/2010
UPDATE: I think that I've found my own solution. It is based on answers to my previous question given by Larry Lustig and Joe Stefanelli.
with tmp2 as
(
    SELECT * FROM table1
    UNION
    SELECT * FROM table2
)
SELECT *
FROM tmp2 c1
WHERE (SELECT COUNT(*)
       FROM tmp2 c2
       WHERE c2.user_id = c1.user_id
         AND c2.name = c1.name
         AND c2.date_added >= c1.date_added) <= 1
Could you please help me convert this query to one that does not use a 'WITH' clause?
Here's a variant of @Andomar's answer:
; with all_users as
(
    select *
    from table1 u1
    union all
    select *
    from table2 u2
)
, ranker as (
    select *,
        -- rank 1 = latest date_added within each (user_id, name) group
        rank() over (partition by user_id, name order by date_added desc) as [r]
    from all_users
)
select * from ranker where [r] = 1
Just in the interests of giving a different approach...
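WITH distinctlist
     AS (SELECT user_id,
                name
         FROM   table1
         UNION
         SELECT user_id,
                name
         FROM   table2)
SELECT C.*
FROM   distinctlist d
       -- For each (user_id, name) take the latest row from each table, then the later of the two.
       -- Each TOP 1 ... ORDER BY is wrapped in its own derived table because
       -- ORDER BY is not allowed directly before UNION ALL.
       CROSS APPLY (SELECT TOP 1 *
                    FROM   (SELECT *
                            FROM   (SELECT TOP 1 *
                                    FROM   table1
                                    WHERE  user_id = d.user_id
                                           AND name = d.name
                                    ORDER  BY date_added DESC) T1
                            UNION ALL
                            SELECT *
                            FROM   (SELECT TOP 1 *
                                    FROM   table2
                                    WHERE  user_id = d.user_id
                                           AND name = d.name
                                    ORDER  BY date_added DESC) T2) T
                    ORDER  BY date_added DESC) C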
You could use not exists, like:
; with all_users as
(
    select *
    from table1 u1
    union all
    select *
    from table2 u2
)
select *
from all_users u1
where not exists
    (
    -- a later row for the same user and name means u1 is not the latest
    select *
    from all_users u2
    where u1.user_id = u2.user_id
        and u1.name = u2.name
        and u1.date_added < u2.date_added
    )
If the database doesn't support CTEs, expand all_users into a subquery in the two places it is used.
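As a rough sketch of that expansion (and of the WITH-free version the question asks for), the union can be written out as a derived table wherever all_users appears. The column names are taken from the question; the outer UNION instead of UNION ALL is an assumption on my part, chosen so that a row present in both tables is not returned twice.

```sql
-- Sketch only: NOT EXISTS with the all_users CTE expanded inline.
select  u1.*
from    (
        -- UNION (not UNION ALL) drops rows that are identical in both tables
        select user_id, name, age, date_added from table1
        union
        select user_id, name, age, date_added from table2
        ) u1
where   not exists
        (
        select  1
        from    (
                select user_id, name, date_added from table1
                union all
                select user_id, name, date_added from table2
                ) u2
        where   u2.user_id = u1.user_id
                and u2.name = u1.name
                and u2.date_added > u1.date_added   -- a later row exists
        )
```

On the sample data in the question this should leave the single row 1,'john',21,01/01/2010. If two different rows share the same latest date_added, both are kept.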
P.S. If there are only three columns, and no more, you could use an even simpler solution:
select  user_id
,       name
,       MAX(date_added) as date_added
from    (
        select  *
        from    table1 u1
        union all
        select  *
        from    table2 u2
        ) sub
group by
        user_id
,       name
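A GROUP BY aggregate like this returns only the grouping columns and the latest date, so with the four-column table from the question the age value is lost. One possible way around that, sketched here under the assumption that the columns are user_id, name, age and date_added, is to join the aggregate back to the combined rows. The aliases u, m and latest_date are made up for illustration.

```sql
-- Sketch only: recover the full latest row by joining back on the max date.
select  u.user_id, u.name, u.age, u.date_added
from    (
        -- UNION removes rows that are identical in both tables
        select user_id, name, age, date_added from table1
        union
        select user_id, name, age, date_added from table2
        ) u
join    (
        select  user_id, name, max(date_added) as latest_date
        from    (
                select user_id, name, date_added from table1
                union all
                select user_id, name, date_added from table2
                ) sub
        group by user_id, name
        ) m
on      m.user_id = u.user_id
and     m.name = u.name
and     m.latest_date = u.date_added
```

This uses neither a CTE nor ROW_NUMBER(). Rows that tie on the latest date_added but differ in age are all returned, so a tie-breaker would still be needed if exactly one row per user and name is required.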