Are MySQL Update Statements Too Expensive?
I've been reading as much as I can about how to write a scalable MySQL schema but I'm still not sure if this is a good idea.
For what it's worth, I'm hosting this project on EC2 with RDS.
In my database, I have a core table that is going to be more writes than reads (Approximate开发者_StackOverflow中文版ly 70% writes and 30% reads I'd guess).
However, when I create new rows in the table, I'm going to need to add updates to it every 5 seconds or so. On the whole, since multiple rows are going to be added/updated every second that means an UPDATE statement is executing every second or so.
From based on what I've been reading, MySQL has something called Table Locking that happens when you are writing to a table? Since this table is also going to be being read from significantly, would it cause too much overhead/locking to use UPDATE statements on the rows?
My options (as far as I know) are:
- Do UPDATE statements on a large table on a frequent basis (every second or maybe more)
- Have a staging table where I create rows, update them, and when ready (about 20 mins until the row is finalized), ship the row from staging table to main table.
I want to avoid the staging table because it makes my users wait about 20 minutes before they can see content, but I'm wondering if it's a necessary evil or not.
Any other ideas? Suggestions?
Thank you!
Whether table locks are applied or not depends on the storage engine. MyISAM does table locks, InnoDB does row locks.
As you want to read and write rows, you will have to use InnoDB is it will allow concurrent access to each row (readers don't block writers, writers don't block readers)
If you udpate the rows based on the primary key, they should be fairly fast (if your server can keep up with the IO that is beeing generated by this).
Regardless of whether Update statements are expensive, it looks like you are going to need to do them anyway. If you had the option of not updating the table, clearly that would be more performant, but I assume that is not the case.
The real question you seem to be asking seems to be more about concurrency than performance anyway. Specifically, you seem to be asking whether you get better concurrency with lots of small updates or fewer batch updates each containing the same total number of changes.
In my experience you are much better off doing a lot of small updates rather than batching them for purposes of concurrency. However, you will generally have better performance (for the updates) if you batch them up.
Another thing to understand is that there are multiple types of locks. The locks on the table for updates and selects are not the same, and there are also considerations about whether your statements lock just a row, page or the entire table. Usually locks are fairly localized, but for this type of application you will want to read up on the specifics of locking strategies and how to manipulate them to your specific needs.
精彩评论