How can I catch Sybase bcp errors reliably?
We're using named pipes with Sybase bcp so that we can compress output on-the-fly.
The Sybase bcp utility does not return much information in its exit code. Sybase documentation directs the user to inspect error messages written by the process.
This is a paraphrase of the error handling idiom we use, some error checking in the non-bcp parts of the script has been removed to shorten the example.
while :
do
{
开发者_StackOverflow rm -f $fifo
mkfifo $fifo
cat $fifo &
CatPid=$!
bcp $db.$owner.$table out $fifo -c $db_creds >$log 2>&1
grep -qi deadlock $log || break
# Must have been a deadlock, clean up.
kill $CatPid
} > $output
done
Basically, if the word 'deadlock' appears in bcp output messages, we try again.
Two questions
- Does this approach look reasonable?
- What other bcp errors than deadlock might we need to worry about?
I'm specifically interested in detecting transient bcp errors, where we can try again.
We use a compound statement so that we can insert headers and footers around the bcp data before the compression, but I've omitted that to simplify the example.
I had idea use named pipe for bcp out and compress data from Sybase ASE to file and then load into Sybase IQ using LOAD TABLE statement. Unfortunately there was a big performance lost. LOAD TABLE from named pipe was 10x slower than LOAD TABLE from file on HP-UX :-( I vote for implementing simple compress alghoritm directly into OC utils (bcp,isql).
So all you need is just reliable fail fast bcp. Bcp for some of Sybase versions has a command argument controlling max error-count. 1) if you set error count = 1, then it will fail more reliably. 2) The problem then boils down to trapping the exit code of bcp process, launched on background with &. I dont know what shell syntax should be used for this exactly, but there might be some common shell technique for this.
This is one is easy.
Does this approach look reasonable?
Not really. First, the shell script is not very good, it does not need all that work but I will leave that alone, as that is not the question.
Second, bcp
will not deadlock, even on an active database (unless you are doing something very strange in the database, or running multiple parallel bcp
streams), it waits for shared locks to clear, so there is no need to check for that.
Third, bcp
provides complete and full error messages. Use the -e bcp_err_file
invocation parameter. Then grep ... bcp_err_file
for errors or patterns ("^[Ee]rr"
and "^Msg"
are typical). I trap separately for errors; exceptions; and other messages.
Fourth, I would never retry forever like that within a shell script. Potential infinite loop, waste of resources. Have it execute once, and produce "success" xor "failure" and the list of errors. Any loop should be only for the list of tables to be exported.
It is written as a proper unix utility/command. If you do not specify an error file, sure, then all error messages go to
$stdout
and they are mixed up with progress messages. You can trap for errors in that stream, but that is legless; specify a separate error file.It is normal to capture
$stdout
to abcp_log_file
, but that is separate to thebcp_err_file
Inspecting exit status on Unixis a different thing. If
bcp
ran successfully (whether it produced error messages or not), it exits as "success"; you will get a non-zero exit status only if a unix program fails.it will tolerate any number of errors, unless you limit that by
-m max_errors
What other bcp errors than deadlock might we need to worry about?
Any and all errors, they cannot be predicted (it is an online server with fixed resources), capture all of them, and then inspect the bcp_err_file
. You can automate the inspection via grep
as detailed above.
I'm specifically interested in detecting transient bcp errors, where we can try again.
No problem at all. Detailed above. Transient errors are few and far between.
You should also worry about hard errors and resource errors, and not try again (because it will fail again).
Most of us worry about errors that result in missing rows, which means the bcp_data_file
is incomplete or unusable.
Is that really going to do what you want? My understanding of the bcp commandline tool is that there is no transaction - ie. if you are loading M rows, but inserting row N fails for any reason (constraints etc.), the first N-1 rows have been inserted. So restarting the whole file isn't a great idea.
You can use the -m X option to allow bcp to carry on in the face of up to X errors, and then try to identify which rows failed to insert and retry them.
You could also look into Michael Peppler's Sybase::BCP Perl module, but our investigations suggest that it may have issues with ASE 15.
精彩评论