Separating SQL data into two tables
Got a slightly tricky problem I'd like some advice on. I have a source SQL table that I need to separat开发者_运维知识库e into two separate tables. The source contains a bunch of columns, including a postal address.
What I need to do is add the rows containing first instance of any address to the first table, Then any subsequent occurrences of the same address (if any) are to be added to the second table, with an additional column containing the ID of the row in the first table (for joining purposes).
Can anyone suggest how to do this? I'd prefer to keep it all within T-SQL (MS SQL Server 2008), but if necessary I can knock together some C# as part of the process.
You should be able to create a suitable CTE (Common Table Expression) do divide those addresses apart. You didn't provide much details - so I tried to make up some column and table names - adapt those as needed.
Try this - it will delete all duplicates from your table:
;WITH DuplicateAddr AS
(
SELECT
(list of columns you need),
ROW_NUMBER() OVER (PARTITION BY Street,ZipCode,City
ORDER BY DateLastChange DESC) 'RowNum'
FROM dbo.Addresses
)
SELECT (list of columns)
FROM DuplicateAddr
WHERE RowNum > 1
GO
With this CTE, you "partition" your data according to a list of columns - use those you need to use. Each partition of data (e.g. each address defined by Street,ZipCode,City
) will get sequential numbers, based on some ordering (that you define in the CTE - I chose DateLastChange
as one possibility).
So the rows with RowNum = 1
make up the set of "first" addresses - instead of just a SELECT (columns) FROM DuplicateAddr
, you can of course also do an INSERT INTO ....
based on that criteria.
All the rows with a RowNum > 1
are the "duplicate" or additional addresses - insert those into your second table.
精彩评论