开发者

Identifying duplicate combinations for n-field data table on Oracle

I managed to make up sql query to update duplicate rows containing combinations to null value for 2-field table. However, I am stuck with more than 2-field table.

My 2-field solution is:

Inserting test data for combinations' table:

create table combinations as 
select 1 col1, 2 col2 from dual --row1 
union all 
select 2, 1 from dual --row2
union all 
select 1, 3 from dual --row3
union all 
select 1,4 from dual; --row4

From combinations' table row1 and row2 are duplicate because ordering of elements doesn't matter.

Updating duplicate combinations to null for 2 fields (updating row2 to null):

update combinations 
set col1=null, col2=null 
where rowid IN(
select x.rid from (
    select 
        rowid rid, 
        col1, 
        col2, 
        row_number() over (partition by least(col1,col2), greatest(col1,col2)
                               order by rownum) duplicate_row 
    from combinations) x 
where duplicate_row > 1);

My code above depends on least(,) and greatest(,) functions and that's why it works neatly. Any ideas to adjusting this code to 3-field table?

Inserting test data for combinations2' table (3-fields)

create table combinations2 as
select 1 col1, 2 col2, 3 col3 from dual --row1
union all
select 2, 1, 3 from dual --row2
union all
select 1, 3, 2 from dual --row3;

Combinations2 table with 3-fields has row1, row2, row3 whic开发者_运维技巧h are equal. My goal is to update row2 and row3 to null.


update combinations2
set col1 = NULL
  , col2 = NULL
  , col3 = NULL
where rowid in (
            select r 
            from
                (
                -- STEP 4
                select r, row_number() over(partition by colls order by colls) duplicate_row
                from
                    (
                    -- STEP 3
                    select r, c1 || '_' || c2 || '_' || c3 colls
                    from
                        (
                        -- STEP 2
                        select r
                              , max(case when rn = 1 then val else null end) c1 
                              , max(case when rn = 2 then val else null end) c2
                              , max(case when rn = 3 then val else null end) c3
                        from 
                            (
                            -- STEP 1
                            select r
                                  , val
                                  , row_number() over(partition by r order by val) rn
                            from
                                (
                                  select rowid as r, col1 as val
                                  from combinations2
                                union all
                                  select rowid, col2
                                  from combinations2
                                union all
                                  select rowid, col3
                                  from combinations2
                                )
                            )
                        group by r
                        )
                    )
                )
            where duplicate_row > 1
            )
;
  • step 1: sort the values in the columns
  • step 2: build rows with sorted values
  • step 3: concatenate columns to a string
  • step 4: find duplicates
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜