change ID number to smooth out duplicates in a table

2023-03-17 22:46 Q&A author:

I have run into this problem that I'm trying to solve: Every day I import new records into a table that have an ID number.

Most of them are new (never seen in the system before), but some are coming in again. I need to append a letter to the end of the ID number if the number is found in the archive, but only if the data in the row differs from the data in the archive, and this needs to be done sequentially. That is, if 12345 is seen a 2nd time with different data, I change it to 12345A, and if 12345 is seen again, and is again different, I need to change it to 12345B, etc.

Originally I tried a while loop that put all the 'seen again' records in a temp table, assigned A on the first pass, deleted those, assigned B to what was left, deleted those, and so on until the temp table was empty, but that hasn't worked out.

Alternately, I've been thinking of trying subqueries as in:

update tempimporttable
set IDNO = (select max(IDNO) from archive) + 1

Any suggestions?

How about this as an idea? Mind you, this is basically pseudocode so adjust as you see fit.

Take "src" as the table that all the data will ultimately be inserted into and "TMP" as your temporary table. This presumes that the ID column in TMP is a double.

do
    update tmp set id = id + 0.01 where id in (select id from src);
until no_rows_changed;

alter table TMP change id into id varchar(255);

update TMP set id = concat(int(id), chr(round((id - int(id)) * 100) + 64));

insert into SRC select * from tmp;
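
The decode step in this pseudocode can be checked outside the database. What follows is only a rough Python sketch of that arithmetic, not the answerer's code, and `decode_id` is a hypothetical helper name. It rounds before converting to a letter, because repeated additions of 0.01 to a double are not exact, and it leaves IDs with no fractional part unsuffixed, since a zero fraction would otherwise map to chr(64), which is '@'.

```python
def decode_id(value: float) -> str:
    """Turn 12345.02 into '12345B'; whole numbers get no suffix."""
    whole = int(value)
    # Round rather than truncate: 0.01 added twice is not exactly 0.02.
    step = round((value - whole) * 100)
    if step == 0:
        return str(whole)
    if step > 26:
        raise ValueError("more than 26 duplicates: no single letter left")
    return f"{whole}{chr(64 + step)}"


print(decode_id(12345.0))               # ID that was never seen before
print(decode_id(12345 + 0.01))          # first duplicate
print(decode_id(12345 + 0.01 + 0.01))   # second duplicate
```

The explicit error for more than 26 duplicates is an added assumption; the pseudocode itself does not say what should happen at that point.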

What happens when you get to 12345Z?

Anyway, change the table structure slightly, here's the recipe:

Drop any indices on ID.
Split ID (apparently varchar) into ID_Num (long int) and ID_Alpha (varchar, not null). Make the default value for ID_Alpha an empty string ('').
So, 12345B (varchar) becomes 12345 (long int) and 'B' (varchar), etc.
Create a unique, ideally clustered, index on columns ID_Num and ID_Alpha.
Make this the primary key. Or, if you must, use an auto-incrementing integer as a pseudo primary key.
Now, when adding new data, finding duplicate ID numbers is trivial, and the last ID_Alpha can be obtained with a simple max() operation.
Resolving duplicate IDs should now be an easier task, using either a while loop or a cursor (if you must).
But it should also be possible to avoid "row by agonizing row" (RBAR) processing and use a set-based approach. A few days of reading Jeff Moden's articles should give you ideas in that regard.
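
As an illustration of the split-column idea, here is a minimal Python sketch, assuming IDs are digits followed by optional uppercase letters. The helper names (`split_id`, `next_alpha`, `next_id`) are hypothetical, and the rollover from 'Z' to 'AA' is just one possible answer to the 12345Z question, not something this recipe prescribes. Note that once suffixes can be longer than one letter, a plain max() is no longer enough, so the sketch orders suffixes by length first.

```python
import re
from typing import Iterable, Tuple


def split_id(raw: str) -> Tuple[int, str]:
    """'12345B' -> (12345, 'B'); '12345' -> (12345, '')."""
    match = re.fullmatch(r"(\d+)([A-Z]*)", raw)
    if match is None:
        raise ValueError(f"unexpected ID format: {raw!r}")
    return int(match.group(1)), match.group(2)


def next_alpha(alpha: str) -> str:
    """'' -> 'A', 'A' -> 'B', 'Z' -> 'AA', 'AZ' -> 'BA' (assumed rollover)."""
    letters = list(alpha)
    i = len(letters) - 1
    while i >= 0:
        if letters[i] != "Z":
            letters[i] = chr(ord(letters[i]) + 1)
            return "".join(letters)
        letters[i] = "A"  # carry into the position to the left
        i -= 1
    return "A" + "".join(letters)


def next_id(id_num: int, archive: Iterable[Tuple[int, str]]) -> Tuple[int, str]:
    """Next free (ID_Num, ID_Alpha) pair; unseen numbers keep an empty suffix."""
    alphas = [alpha for num, alpha in archive if num == id_num]
    if not alphas:
        return id_num, ""
    # Shorter suffixes sort first, so 'Z' comes before 'AA'.
    latest = max(alphas, key=lambda a: (len(a), a))
    return id_num, next_alpha(latest)


archive = [split_id(s) for s in ["12345", "12345A", "67890"]]
print(next_id(12345, archive))
print(next_id(67890, archive))
print(next_id(11111, archive))
```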

Here is my final solution:

update a
set IDnum=b.IDnum
from tempimporttable a inner join 
    (select * from archivetable
     where IDnum in 
     (select max(IDnum) from archivetable
      where IDnum in 
       (select IDnum from tempimporttable)
      group by left(IDnum,7) 
      )
     ) b
on b.IDnum like a.IDnum + '%'
WHERE 
*row from tempimport table = row from archive table*

That sets incoming rows that match an archived row to the same IDnum as the old row. Then:

update a
set IDnum = case 
    -- latest archived ID has no suffix yet (7 characters): start at 'A'
    when len((select max(IDnum) from archivetable where left(IDnum,7) = left(a.IDnum,7))) = 7 then a.IDnum + 'A'
    -- otherwise advance the latest suffix letter by one
    else left(a.IDnum,7) + char(ascii(right((select max(IDnum) from archivetable where left(IDnum,7) = left(a.IDnum,7)),1))+1)
    end
from tempimporttable a
where not exists ( *select rows from archive table* )
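
To sanity-check the intended outcome of the two updates outside SQL Server, here is a rough Python sketch. It is not a translation of the queries: `resolve` is a hypothetical helper, the archive is modelled as a plain dict from IDnum to row data, and it assumes each incoming row is compared only against the latest archived version of its 7-character base ID. It also does not handle the same ID arriving twice within one import, or suffixes past 'Z'.

```python
from typing import Dict, List, Tuple

BASE_LEN = 7  # the queries assume a 7-character base ID


def resolve(incoming: List[Tuple[str, str]],
            archive: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return incoming (IDnum, data) rows with their IDs adjusted."""
    result = []
    for idnum, data in incoming:
        base = idnum[:BASE_LEN]
        versions = sorted(k for k in archive if k[:BASE_LEN] == base)
        if not versions:
            result.append((idnum, data))    # never seen: keep the ID as-is
            continue
        latest = versions[-1]               # plays the role of max(IDnum)
        if archive[latest] == data:
            result.append((latest, data))   # first update: reuse archived ID
        elif len(latest) == BASE_LEN:
            result.append((base + "A", data))   # first changed version
        else:
            next_letter = chr(ord(latest[-1]) + 1)
            result.append((base + next_letter, data))  # later changed version
    return result


archive = {"1234567": "x", "1234567A": "y", "7654321": "p"}
incoming = [("1234567", "y"), ("1234567", "z"),
            ("7654321", "q"), ("1111111", "n")]
for row in resolve(incoming, archive):
    print(row)
```

One thing the sketch makes visible: an ID that is not in the archive at all has to be left alone explicitly. In the second update, max(IDnum) is NULL for such rows, so the where clause needs to exclude them.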

I don't know if anyone wants to delve too far into this, but I appreciate constructive criticism...

Tags: duplicate-data, sql-server-2000
