SSIS - user variable used in derived column transform is not available - in some cases
Unfortunately I don't have a repro for my issue, but I thought I would try to describe it in case it sounds familiar to someone... I am using SSIS 2005, SP2.
My package has a package-scope user variable - let's call it user_var
- first step in the control flow is an Execute SQL task which runs a stored procedure. All that SP does is insert a record in a SQL table (with an identity column) and then go back and get the max ID value. The Execute SQL task saves this output into user_var
- the control flow then has a Data Flow Task - it goes and gets some source data, has a derived column which sets a column called run_id to user_var - and saves the data to a SQL destination
In most cases (this template is used for many packages, running every day) this all works great. All of the destination records created get set with a correct run_id.
However, in some cases, there is a set of the destination data that does not get run_id equal to user_var, but instead gets a value of 0 (0 is the default value for user_var).
I have 2 instances where this has happened, but I can't make it happen. In both cases, it was just less that 10,000 records that have run_id = 0. Since SSIS writes data out in 10,000 record blocks, this really makes me think that, for the first set of data written out, user_var was not yet set. Then, after that first block, for the rest of the data, run_id is set to a correct value.
But control passed on to my data flow from the Execute SQL task - i开发者_Python百科t would have seemed reasonable to me that it wouldn't go on until the SP has completed and user_var is set. Maybe it just runs the SP, but doesn't wait for it to complete?
In both cases where this has happened there seemed to be a few packages hitting the table to get a new user_var at about the same time. And in both cases lots of data was written (40 million rows, 60 million rows) - my thinking is that that means the writes were happening for a while.
Sorry to be both long-winded AND vague. A winning combination! Does this sound familiar to anyone? Thanks.
Updating to show the SP I use to get the user_var:
CREATE PROCEDURE [dbo].[sp_GetRunIDForPackage] (@pkg varchar(50)) AS
-- add a new entry for this run of this package - the RUN_ID is an IDENTITY column and so
-- will get created for us
INSERT INTO shared.STAGE_LOAD_JOB( EFFECTIVE_TS, EXECUTED_BY )
VALUES( getdate(), @pkg )
-- now go back into the table and get the new RUN_ID for this package
SELECT MAX( RUN_ID )
FROM shared.STAGE_LOAD_JOB
WHERE EXECUTED_BY = @pkg
Is this variable being accessed lots of times, from lots of places? Do you have a bunch of parallel data flows using the same variable?
We've encountered a bug in both SQL 2005 and 2008 whereby a "race condition" causes the variable to be inaccessable from some threads, and the default value is used. In our case, the variable was our "base folder" location for packages, causing our overall execution control package to not find its sub-packages.
More detail here: SSIS Intermittent variable error: The system cannot find the file specified
Unfortunately, the work-around is to hard-code a default value into the variable that will work when the race condition happens. Easy for us (set base folder to be correct for our prod environment), but looks a lot hard for your issue.
Perhaps you could use multiple variables (one for each data flow), and a bunch of Execute SQL tasks to populate those variables? REALLY ugly, but it should help.
Did you check the value of user_var before getting to the Derived Column Component? It sounds like user_var may be 0 so you are doing run_id = user_var; run_id = 0. I may be naive to think it is that simple but that's the first thing I would check.
Given the procedure code, you might want to replace this:
SELECT MAX( RUN_ID ) FROM shared.STAGE_LOAD_JOB WHERE EXECUTED_BY = @pkg
with this:
select scope_identity()
The scope_identity() function returns the identity that was entered in the current scope, which is the procedure. Not sure if this will solve the problem, but I find it best to work through them all as they might have unrelated consequences.
精彩评论