Identifying duplicate combinations for n-field data table on Oracle
I managed to make up sql query to update duplicate rows containing combinations to null value for 2-field table. However, I am stuck with more than 2-field table.
My 2-field solution is:
Inserting test data for combinations' table:
create table combinations as
select 1 col1, 2 col2 from dual --row1
union all
select 2, 1 from dual --row2
union all
select 1, 3 from dual --row3
union all
select 1,4 from dual; --row4
From combinations' table row1 and row2 are duplicate because ordering of elements doesn't matter.
Updating duplicate combinations to null for 2 fields (updating row2 to null):
update combinations
set col1=null, col2=null
where rowid IN(
select x.rid from (
select
rowid rid,
col1,
col2,
row_number() over (partition by least(col1,col2), greatest(col1,col2)
order by rownum) duplicate_row
from combinations) x
where duplicate_row > 1);
My code above depends on least(,) and greatest(,) functions and that's why it works neatly. Any ideas to adjusting this code to 3-field table?
Inserting test data for combinations2' table (3-fields)
create table combinations2 as
select 1 col1, 2 col2, 3 col3 from dual --row1
union all
select 2, 1, 3 from dual --row2
union all
select 1, 3, 2 from dual --row3;
Combinations2 table with 3-fields has row1, row2, row3 whic开发者_运维技巧h are equal. My goal is to update row2 and row3 to null.
update combinations2
set col1 = NULL
, col2 = NULL
, col3 = NULL
where rowid in (
select r
from
(
-- STEP 4
select r, row_number() over(partition by colls order by colls) duplicate_row
from
(
-- STEP 3
select r, c1 || '_' || c2 || '_' || c3 colls
from
(
-- STEP 2
select r
, max(case when rn = 1 then val else null end) c1
, max(case when rn = 2 then val else null end) c2
, max(case when rn = 3 then val else null end) c3
from
(
-- STEP 1
select r
, val
, row_number() over(partition by r order by val) rn
from
(
select rowid as r, col1 as val
from combinations2
union all
select rowid, col2
from combinations2
union all
select rowid, col3
from combinations2
)
)
group by r
)
)
)
where duplicate_row > 1
)
;
- step 1: sort the values in the columns
- step 2: build rows with sorted values
- step 3: concatenate columns to a string
- step 4: find duplicates
精彩评论