In MySQL, why do time-consuming updates get slowed down by other processes waiting for the table(s) being updated?
Once in a while, I need to perform a massive update to a very large table. If users continue hitting the Web site while the update is being run, there will be a line-up of MySQL clients.
It appears that the longer the line-up, the slower the main operation gets (i.e. it updates fewer rows per unit time). Killing开发者_开发知识库 those processes can speed things up, but they're bound to come back.
Is there a way to address this (other than by bringing the site down)? I don't mind the users waiting a few minutes, but once the line-up has reached a certain size, the operation never completes.
This applies to UPDATE statements, as well as statements resulting in a temporary table being created (e.g. ALTER TABLE)
The waiting connections take up memory, and they're queueing up lock requests on your large and busy table. Eventually you're going to exhaust your maximum DB connections or one of your memory pools due to the number of connections held open. If I had to guess, I'd guess that your slowdown is due to memory exhaustion and the resultant swap-thrashing.
If the update you're doing doesn't require consistency between rows in the large table, you can try lowering the isolation level of the update transaction, using SET TRANSACTION ISOLATION LEVEL. This will greatly decrease the amount of locking and work that MySQL normally does to provided each client "repeatable reads" on a table being updated and read concurrently. You could also try partitioning your large table and running one update per partition, or otherwise breaking up the update operation into multiple pieces so that the table isn't locked for a long time at any one stretch.
If you do require consistency to be maintained between the rows, i.e. the whole table has to go from state X to X' in a single transaction with no intermediate states ever being visible, you're not going to be able to use the above techniques. You might try cloning the table, doing the update on the new table, then renaming the old table out of the way and renaming the new table into its place. Since it's a large table, this may require a significant increase in the runtime and storage needed for the operation. There are also caveats for doing this when triggers and constraints are present. The benefit is that you avoid holding a write lock on the table being updated, except during the relatively fast rename operations. Your users will only be delayed during that small swap window, and this will likely not take so long as to cause the major slowdowns you've experienced.
精彩评论