Consolidating values and deleting duplicates in a junction table using CTE
I have the following schema:
Parcels Segments SegmentsParcels
========= ========== =================
ParcelID SegmentID ParcelID
... Name SegmentID
... id
A user of the data wants to consolidate Segments.Names and gave me a list of current Segment.Names mapped to new Segment.Names (all of which currently exist).
So now I have this list in 开发者_如何学Pythona temporary table with the currentID and newID to map to.
What I want to do is update the SegmentID in SegmentsParcels based on this map. I could use the statement:
update SegmentParcels set segmentID = [newID] from newsegments where segmentID = currentid
but this will create some duplicates I have a unique constraint on ParcelID and SegmentID in SegmentParcels.
What is the best way to go about this? I considered removing the constraint and then dealing with removing the duplicates (which I did at one point and could probably do again) but I was hoping there was a simpler way.
try something like this:
CREATE TABLE #Results (oldID int, newid int)
update sp
set segmentID = dt.newID
OUTPUT sp.segmentID, dt.newID
INTO #Results
FROM SegmentParcels sp
INNER JOIN (SELECT DISTINCT
segmentID, newID
FROM newsegments ns
INNER JOIN SegmentParcels sp ON ns.currentid = sp.segmentID
LEFT OUTER JOIN SegmentParcels sp2 ON ns.currentid = sp2.segmentID AND sp.ParcelID=sp2.ParcelID
WHERE sp2.ParcelID IS NULL
) dt ON dt.segmentID=sp.segmentID
I threw in the OUTPUT clause which is SQL Server 2005 and up, just because it might be helpful if you have a large script running. This way you have the old and new values.
I ended up dropping the constraint and deleting the duplicates using a common table expression.
Here is the sql I used:
-- drop unique constraint
ALTER TABLE [dbo].[SegmentParcels] DROP CONSTRAINT [uc_SegmentID_ParcelID]
GO
-- update segment ids to new values from map
update SegmentParcels
set segmentID = [newID] from newsegments where segmentID = currentid
GO
-- use common table expression to delete duplicates
WITH Duplicates(SegmentID, ParcelID, Id)
AS
(
SELECT SegmentID, ParcelID, Min(Id) Id
FROM SegmentParcels
GROUP BY segmentID, parcelID
HAVING Count(*) > 1
)
DELETE FROM SegmentParcels
WHERE Id IN (
SELECT SegmentParcels.Id
FROM SegmentParcels
INNER JOIN Duplicates
ON SegmentParcels.segmentID = Duplicates.SegmentID
AND SegmentParcels.parcelID = Duplicates.ParcelID
AND SegmentParcels.Id <> Duplicates.Id
)
-- add unique constraint back
ALTER TABLE [dbo].[SegmentParcels] ADD CONSTRAINT [uc_SegmentID_ParcelID] UNIQUE NONCLUSTERED
(
[segmentID] ASC,
[parcelID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
精彩评论