Stored procedure performs terribly - increase timeout or fix the problem
I've inherited a front end written by a third-party. That front end interacts with Oracle through procedures written by a different third-party. The stored procedure in question requires 2 minutes and 36 seconds to return search results when it is manually executed. I can't see the procedure and that team has suggested that I increase the timeout in the Web application (hosted on a shared server).
In my world, anything over 30 seconds would require a performance fix before being deployed to production, with a few exceptions (legacy code, crazy reports, etc.). The option that was suggested to me was to increase the timeout from 30 seconds (explicitly added by the front end developer) to 180 seconds.
My question to you: What are the risks with taking the easy approach and increasing the timeout? If possible, please provide links to articles that support your views so I can reference them.
Also feel free to chime in if you believe this is a non-issue.
You should not increase the timeout to hide the problem. You've positively identified a performance problem -- it's something that should be fixed, not swept under the rug.
Two things to do:
Get a SQL trace of the stored procedure.
exec dbms_monitor.session_trace_enable(binds => false, waits => true);
exec poor_performing_procedure();
exec dbms_monitor.session_trace_disable();
Look to see what statements are being run how often, and how much time is spent running them.
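To see where the time actually went, you can format the raw trace file with tkprof on the database server. A minimal sketch, assuming 11g or later for the v$diag_info lookup (on 10g, look in the directory given by the user_dump_dest parameter instead; file names below are placeholders):
-- Find the trace file your session just wrote (11g and later):
SELECT value FROM v$diag_info WHERE name = 'Default Trace File';
-- Then, from the server's command line, format it and sort statements
-- by elapsed execution and fetch time:
--   tkprof orcl_ora_12345.trc slow_proc_report.txt sort=exeela,fchela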
Add hooks into DBMS_PROFILER in the code for the stored procedure.
I've got code like this in all my packages so that I can determine whether or not to profile them by setting a package variable:
PROCEDURE profiler_control(p_start_stop IN VARCHAR2, p_run_comm IN VARCHAR2, p_ret OUT BOOLEAN) AS
  l_ret_code INTEGER;
BEGIN
  l_ret_code := dbms_profiler.internal_version_check;
  IF l_ret_code != 0 THEN
    p_ret := FALSE;
  ELSIF p_start_stop NOT IN ('START', 'STOP') THEN
    p_ret := FALSE;
  ELSIF p_start_stop = 'START' THEN
    l_ret_code := DBMS_PROFILER.START_PROFILER(run_comment1 => p_run_comm);
    IF l_ret_code = 0 THEN
      p_ret := TRUE;
    ELSE
      p_ret := FALSE;
    END IF;
  ELSIF p_start_stop = 'STOP' THEN
    l_ret_code := DBMS_PROFILER.FLUSH_DATA;
    l_ret_code := DBMS_PROFILER.STOP_PROFILER;
    IF l_ret_code = 0 THEN
      p_ret := TRUE;
    ELSE
      p_ret := FALSE;
    END IF;
  END IF;
END profiler_control;
Inside the procedures, then, there's code like:
create or replace procedure poorly_performing_procedure as
begin
  if run_profiler then
    profiler_control('START', 'poorly_performing_procedure', g_retval);
  end if;
  ...
  if run_profiler then
    profiler_control('STOP', 'poorly_performing_procedure', g_retval);
  end if;
end poorly_performing_procedure;
/
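To actually capture a profile, you set the controlling package variable before invoking the procedure. A sketch of what that looks like from SQL*Plus; my_pkg and run_profiler are placeholders for however your package exposes the flag and the procedure:
exec my_pkg.run_profiler := TRUE;
exec my_pkg.poorly_performing_procedure;
exec my_pkg.run_profiler := FALSE;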
Oracle provides scripts (one named profiler.sql) that you can use to get readable reports showing how many times each PL/SQL statement/operation was executed during a run. Here's a link to the DBMS_PROFILER documentation for 10g.
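If you'd rather query the profiler tables directly (they are created by Oracle's proftab.sql script), something along these lines shows the most expensive lines of the latest run; the table and column names are the standard ones:
-- total_time is reported in nanoseconds
SELECT u.unit_name, d.line#, d.total_occur, d.total_time
  FROM plsql_profiler_units u
  JOIN plsql_profiler_data d
    ON d.runid = u.runid
   AND d.unit_number = u.unit_number
 WHERE u.runid = (SELECT MAX(runid) FROM plsql_profiler_runs)
 ORDER BY d.total_time DESC;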
The problem with increasing the timeout globally is that you could run into several issues down the road:
- Denial of service attacks.
- Resource exhaustion on the server.
- Reduced throughput.
When you increase the timeout, you are telling the server to keep the thread servicing that request alive for longer. The server has a limited number of threads, so a thread kept busy for an extended period is a thread that is not available to service other requests. If you have a lot of requests that take a long time to run, you eventually run out of threads and the server becomes unresponsive.
Whether this matters at all for you will depend on how many requests are made to that particular stored procedure. If there is only one request made every so often, then it's not a big deal. However, the problem with setting the timeout globally is that it now applies to ALL requests, so if there are other requests that might take a long time to run, you will be extending their duration as well.
If you really want the procedure to complete, you should run it as a background task (not in the script context) and report its state in a resource accessible via AJAX or a plain page.
If the procedure runs in the script context, then whenever the script dies, the Oracle session dies with it and the procedure is rolled back if it has not completed yet. This can happen for reasons other than the timeout (the connection drops, the user closes the page, etc.).
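One way to decouple the long-running call from the web request, assuming you are allowed to create jobs in the database, is to have the page submit a DBMS_SCHEDULER job and then poll for the result; the job and procedure names below are placeholders:
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name   => 'RUN_SLOW_SEARCH',            -- placeholder job name
    job_type   => 'STORED_PROCEDURE',
    job_action => 'POOR_PERFORMING_PROCEDURE',  -- the slow procedure
    enabled    => TRUE,                         -- start right away
    auto_drop  => TRUE);                        -- clean up after completion
END;
/
The front end can then check the job's state (or a status row the procedure writes) from the AJAX call instead of holding the request open.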
I don't think increasing the timeout to 180 seconds is a good idea. I worked for a fast-growing company for two years. During that time we had dozens of stored procedures whose execution times kept creeping up: they started out running in under 1 second, then took 30 seconds, and eventually 2 or 3 minutes. Once they got to 2 minutes they would cause a timeout in the site, we would catch it, and we would rewrite the proc to be more efficient. Long story short, if you raise the timeout to 180 seconds, you may well be raising it to 360 seconds in a month and 720 seconds in two months. You can see where this is going. If others disagree, you need to understand where they are coming from, because any kind of data growth is going to slow your performance down.
You say that you can't see the procedure. Do you have access to the database, and can you make changes to it? If you can't access the database and/or can't make changes to it, I think your options are limited to:
- Increase the timeout, and
- Encourage the outside vendors to address the problem.
As others have said, increasing the timeout is not a good solution for all sorts of reasons. Encouraging the outside vendors to help (by, for example, threatening to replace their application with a competitor's, or refusing to pay license fees until your problems are addressed) may be your best bet. As a rule, the person you want to talk to about this is the SALES GUY, not the technical staff. The techies do not, in general, give a hoot about losing revenue because it doesn't affect them directly. The sales guy DOES care, because if you bail or refuse to pay it probably has a direct impact on his paycheck, so he has a vested interest in keeping you happy, and he probably has some level of influence over the techs. Key catchphrases for dealing with sales guys are "...performance is unacceptable in our environment...", "...unable to support the business using your application...", "...will not pay one cent more for this application until it meets our needs...", and the always popular "...we will be evaluating alternative solutions...". Remember, the sales dude is there to be your interface to his company - so get in his face - or better yet, have some manager type chew on the sales dude for a while. (That's what managers are for...) The squeaky wheel gets the grease...
If, on the other hand, you can modify the database, it might serve you well to do as @Adam Munsch suggested and figure out which SQL statements are running so doggone slow. You might be able to dramatically improve the situation by adding an index or two.
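For example, if you have enough privileges to look at the shared pool, a quick check of the most expensive statements can tell you where an index might pay off. This is just a sketch against the standard v$sql view (elapsed_time is in microseconds):
SELECT *
  FROM (SELECT sql_id,
               executions,
               ROUND(elapsed_time / 1e6, 1) AS total_elapsed_secs,
               sql_text
          FROM v$sql
         ORDER BY elapsed_time DESC)
 WHERE ROWNUM <= 10;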
Good luck.