Optimizing the stored procedure to process 3 million records
I have designed a stored procedure usign Sql Server 2005 below to compare 3 million records in each of the Profile and Source Table, and update the Source table with records exist in another table (PROFILE_BC) which will also have about 3 million records. I am trying to Optimize this code below. Can you suggest any other method ? I just worried that this will take more than about 6 hours to complete. Can we do the same using DTS ? And ideas how this can be done using DTS. Some suggested that there is a component called, Lookup, Fuzzy Lookup that can be used. Any ideas in optimizing the same are welcome.
USE Database
GO
/****** Object: StoredProcedure [dbo].[ProcName] Script Date: 11/13/2010 17:15:04 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[ProcName]
AS
BEGIN
SET NOCOUNT ON;
DECLARE @not_on_ebc_file_xx## char(2);
SET @not_on_ebc_file_xx## = 35;
DECLARE @voters_no varchar(18);
DECLARE @candidate_id char(10);
DECLARE @perm_disq_temp char(2);
DECLARE @voters_no_jms varchar(18);
DECLARE PROFILES_CURSOR CURSOR LOCAL FAST_FORWARD
FOR SELECT CP.CANDIDATE_ID, CP.VOTERS_NO FROM PROFILE CP INNER JOIN SOURCE SR ON
CP.CANDIDATE_ID = SR.CANDIDATE_ID
WHERE CP.CANDIDATE_ID NOT LIKE 'MA%';
OPEN 开发者_运维问答PROFILES_CURSOR;
FETCH NEXT FROM PROFILES_CURSOR
INTO @candidate_id, @voters_no;
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @voters_no_jms = VOTERS_NO FROM PROFILE_BC WHERE VOTERS_NO = @voters_no;
SELECT @perm_disq_temp = PERM_DISQ FROM SOURCE WHERE CANDIDATE_ID = @candidate_id;
IF (@voters_no_jms = @voters_no) -- record exists in jms_temp table/ebc file
BEGIN
IF (@perm_disq_temp = @not_on_ebc_file_xx##)
BEGIN
UPDATE SOURCE SET PERM_DISQ = '' WHERE CANDIDATE_ID = @candidate_id;
END
END
ELSE
BEGIN
IF (@perm_disq_temp = '' OR @perm_disq_temp IS NULL)
BEGIN
UPDATE SOURCE SET PERM_DISQ = @not_on_ebc_file_xx## WHERE CANDIDATE_ID = @candidate_id;
END
END
SET @voters_no_jms = '';
FETCH NEXT FROM PROFILES_CURSOR INTO @candidate_id, @voters_no;
END
CLOSE PROFILES_CURSOR;
DEALLOCATE PROFILES_CURSOR;
END
you should always try to avoid using cursors if you have performance on mind. Instead of using a cursor you could do this. Try to think in terms of sets when dealing with SQL. I commented out the update part of the query and added a select so that you can see the data.
BEGIN TRANSACTION
DECLARE @not_on_ebc_file_xx## char(2);
SET @not_on_ebc_file_xx## = 35;
--UPDATE SR
--SET PERM_DISQ =
-- CASE WHEN NOT PROFILE_BC.VOTERS_NO IS NULL THEN
-- CASE WHEN PERM_DISQ.PERM_DISQ = @not_on_ebc_file_xx## THEN ''
-- ELSE PERM_DISQ.PERM_DISQ
-- END
-- WHEN PERM_DISQ.PERM_DISQ = '' OR PERM_DISQ IS NULL THEN @not_on_ebc_file_xx##
-- ELSE PERM_DISQ.PERM_DISQ
-- END
SELECT CASE WHEN NOT PROFILE_BC.VOTERS_NO IS NULL THEN
CASE WHEN PERM_DISQ.PERM_DISQ = @not_on_ebc_file_xx## THEN ''
ELSE PERM_DISQ.PERM_DISQ
END
WHEN PERM_DISQ.PERM_DISQ = '' OR PERM_DISQ IS NULL THEN @not_on_ebc_file_xx##
ELSE PERM_DISQ.PERM_DISQ
END AS PERM_DISQ
FROM PROFILE CP
INNER JOIN SOURCE SR
ON CP.CANDIDATE_ID = SR.CANDID_ID
LEFT JOIN PROFILE_BC
ON CP.VOTERS_NO = PROFILE_BC.VOTERS_NO
LEFT JOIN SOURCE PERM_DISQ
ON CP.CANDIDATE_ID = PERM_DISQ.CANDIDATE_ID
WHERE CP.CANDIDATE_ID NOT LIKE 'MA%';
ROLLBACK TRANSACTION;
uncomment the update and set statement and comment out the select statement to update
精彩评论