Proper way of declaring cursor for huge number of updates
I need some suggestions as to whether my idea is ok or not. I have a situation where:
I need to do updates to every row of a table. There is some logic involved in the update. The logic is very simple, but it needs to be applied to every row, and it is possible that every row will be changed.
Currently, I'm thinking about writing an ESQL/C program to do this. I'm thinking about loading every row into its equivalent C structure through a SELECT ... FOR UPDATE cursor, running the logic, and committing. What role does the HOLD keyword play on the cursor? I'm a bit confused about it.
These updates will be done during a system downtime period. The table contains approximately 130 million rows. It has about 45 columns. Most of the columns are of type SMALLINT and INTEGER.
Am I on the right track? Suggestions welcome.
The database will be Informix (IDS version 11.50 FC6)
The key to making this work is to do the work in the server, rather than making the server select each row, pass it to the client, and then accept the data back from the client.
UPDATE YourTable
SET y = (CASE WHEN z > x THEN p ELSE y END)
WHERE key_column BETWEEN lo_val AND hi_val;
The complicated part will likely be splitting the work into manageable sub-transactions; that's what the 'lo_val .. hi_val' condition is about. If your logical logs are big enough to handle all 130 million rows being updated [about (2 * (row size + X) * number of rows), with X being a value around 20, I believe] with space to spare, then you can do it all at once. Clearly, this 'updates' every row.
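As a rough sketch of what the sub-transactions could look like when written out by hand, each key range gets its own transaction so the logical logs only have to hold one chunk at a time. This assumes a logged, non-ANSI-mode database (where BEGIN WORK applies); the literal range bounds are hypothetical and stand in for whatever lo_val and hi_val you choose.

```sql
-- Chunk 1: hypothetical key range 0..99999
BEGIN WORK;
UPDATE YourTable
   SET y = (CASE WHEN z > x THEN p ELSE y END)
 WHERE key_column BETWEEN 0 AND 99999;
COMMIT WORK;

-- Chunk 2: the next hypothetical key range
BEGIN WORK;
UPDATE YourTable
   SET y = (CASE WHEN z > x THEN p ELSE y END)
 WHERE key_column BETWEEN 100000 AND 199999;
COMMIT WORK;
```

Without the explicit BEGIN WORK and COMMIT WORK, each UPDATE would still run as its own single-statement transaction; spelling them out only makes the chunk boundaries visible.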
If you decide you must do it in the client (a mistake, but ...), then:
You use a SELECT cursor with HOLD so that it stays open and correctly positioned across transactions. You start a transaction, fetch a few thousand rows, updating each one as needed, then commit and start the next transaction. Make sure you are using a prepared UPDATE statement; maybe you use a WHERE CURRENT OF condition.
Do you suggest putting the update, as part of the cursor, in a stored procedure?
No, though you could do it in a stored procedure. It depends in part on whether this is something you're going to do on a regular basis; if so, maybe the stored procedure is a good idea, but I wouldn't for a one-off exercise.
It depends on how you are going to determine lo_val and hi_val. I'd probably use I4GL (because I'm fluent in it) and then I'd expect to prepare the UPDATE statement (with question marks in place of 'lo_val' and 'hi_val'), and then I'd expect to execute it a number of times, each time forming a single-statement transaction. So, if you decided to go with ranges of lo_val..hi_val from 000000..099999, 100000..199999, ... then you'd iterate:
for i = 0 to 10000000 step 100000
    let j = i + 99999
    execute p_update using i, j
end for
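If the key range is not known in advance, a single query along these lines could supply the loop bounds. This is only a sketch: it assumes key_column is a numeric, indexed key, and the column aliases are made up for illustration.

```sql
-- Find the overall key range and row count to size the chunks.
SELECT MIN(key_column) AS lowest_key,
       MAX(key_column) AS highest_key,
       COUNT(*)        AS total_rows
  FROM YourTable;
```

If the keys are sparse or clustered, fixed-width ranges will produce uneven chunks, so the step size may need adjusting.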
In I4GL you would not absolutely need to use a prepared statement. If you have IDS 11, you can prepare statements in SPL. In earlier versions, and without much of a performance hit (I doubt if you could measure it reliably), you could simply use:
CREATE PROCEDURE update_your_table()
    DEFINE lo_val, hi_val INTEGER;
    -- Each pass updates one key range of 100000 values.
    FOR lo_val = 0 TO 1000000 STEP 100000
        LET hi_val = lo_val + 99999;
        UPDATE YourTable
           SET y = (CASE WHEN z > x THEN p ELSE y END)
         WHERE key_column BETWEEN lo_val AND hi_val;
    END FOR;
END PROCEDURE;
Untested code - use at your own risk!
SPL is the way to go, but I suggest you replicate the table and test your mass update first.
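One way such a trial run might look, as a sketch only: copy a slice of the table into a scratch temp table, apply the same logic, and check the result. The temp table name yourtable_test and the key range are hypothetical, and the columns x, y, z, p are the placeholders used in the answer above.

```sql
-- Copy a small key range into a scratch table.
SELECT *
  FROM YourTable
 WHERE key_column BETWEEN 0 AND 99999
  INTO TEMP yourtable_test WITH NO LOG;

-- How many rows in the copy will actually change?
SELECT COUNT(*) FROM yourtable_test WHERE z > x AND y <> p;

-- Apply the same update logic to the copy.
UPDATE yourtable_test
   SET y = (CASE WHEN z > x THEN p ELSE y END);

-- After the update this count should be zero.
SELECT COUNT(*) FROM yourtable_test WHERE z > x AND y <> p;
```

Because the temp table is created WITH NO LOG, this checks the logic but not the logical-log usage; timing and log consumption would need a test against a logged copy.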