SQL search and destroy duplicates
I have a table with fields (simplified):
id, fld1, fld2, fld3.
id
is a numeric primary key field.
There are duplicates: id
differs but fld1
, fld2
and fld3
are identical over 2 or more rows. There are also entries where the values occur only once, i.e. non-duplicates, of course.
Of each set of duplicate entries, I want to retain only the entry with the highest ID. I was planning to first list the doomed rows and then to delete them.
My first stab at it was this:
SELECT * FROM tab1 t1 WHERE EXISTS (
SELECT COUNT(*) FROM tab1 t2
WHERE t1.fld1 = t2.fld1 AND t1.fld2 = t2.fld2 AND t1.fld3 = t2.fld3
AND t1.id < MAX(t2.id)
HAVING COUNT(*) > 1
GROUP BY t2.fld1, t2.fld2, t2.fld3)
But (in Oracle) I'm getting a Missi开发者_Python百科ng right parenthesis
error message. I think this needs a new approach altogether, but my SQL-fu is not up to the task. Help appreciated!
Edit:
With 'real' data fields:
select x.leg_id, x.airline_des, x.flight_nr, x.suffix, x.flight_id_date, x.lt_flight_id_date
from fdb_leg x
join ( select max(t.leg_id) 'max_id',
t.airline_des, t.flight_nr, t.suffix, t.flight_id_date, t.lt_flight_id_date
from fdb_leg t
group by t.airline_des, t.flight_nr, t.suffix, t.flight_id_date, t.lt_flight_id_date
having count(*) > 1) y on y.max_id > x.leg_id
and y.airline_des = x.airline_des and y.flight_nr = x.flight_nr and y.suffix = x.suffix
and y.flight_id_date = x.flight_id_date and x.lt_flight_id_date = y.lt_flight_id_date
Response is:
ORA-00923: FROM keyword not found where expected
Oracle 9i+, Using WITH:
To get the list of doomed entries, use:
WITH keepers AS (
SELECT MAX(t.id) 'max_id',
t.fld1, t.fld2, t.fld3
FROM TABLE_1 t
GROUP BY t.fld1, t.fld2, t.fld3
HAVING COUNT(*) > 1)
SELECT x.id,
x.fld1, x.fld2, x.fld3
FROM TABLE_1 x
JOIN keepers y ON y.max_id > x.id
AND y.fld1 = x.fld1
AND y.fld2 = x.fld2
AND y.fld3 = x.fld3
Non-WITH Equivalent:
To get the list of doomed entries, use:
SELECT x.id,
x.fld1, x.fld2, x.fld3
FROM TABLE_1 x
JOIN (SELECT MAX(t.id) 'max_id',
t.fld1, t.fld2, t.fld3
FROM TABLE_1 t
GROUP BY t.fld1, t.fld2, t.fld3
HAVING COUNT(*) > 1) y ON y.max_id > x.id
AND y.fld1 = x.fld1
AND y.fld2 = x.fld2
AND y.fld3 = x.fld3
You can delete them in one shot, like this:
SQL> create table mytable (id, fld1, fld2, fld3)
2 as
3 select 1, 1, 1, 1 from dual union all
4 select 2, 1, 1, 1 from dual union all
5 select 3, 2, 2, 2 from dual union all
6 select 4, 2, 3, 2 from dual union all
7 select 5, 2, 3, 2 from dual union all
8 select 6, 2, 3, 2 from dual
9 /
Table created.
SQL> delete mytable
2 where id not in
3 ( select max(id)
4 from mytable
5 group by fld1
6 , fld2
7 , fld3
8 )
9 /
3 rows deleted.
SQL> select * from mytable
2 /
ID FLD1 FLD2 FLD3
---------- ---------- ---------- ----------
2 1 1 1
3 2 2 2
6 2 3 2
3 rows selected.
Regards, Rob.
Ugh, I get it. Scratch that.
This will identify the ID's needed to delete.
Select
fld1
, fld2
, fld3
, Max(ID)
From table_name
Group By
fld1
, fld2
, fld3
精彩评论