Querying and making dumps of production databases
I have some large MySQL production tables that I need to dump so that I can run some analyses in Hadoop. What best practices I should know about making dumps and queries of p开发者_运维知识库roduction databases? Do I need to worry about affecting production performance if I'm just making dumps/reads (and no writes)?
Yes, you do need to worry about it. MyISAM tables will lock while doing reads, no modifications will be allowed, just other reads. InnoDB doesn't have this problem, but it will lock if you are querying large sets of data. Dumps will lock any table type in almost all cases.
What is usually done is you setup a slave database and do all your queries on that. Even setup a slave just for reporting purposes. You will need to stop the MySQL server to setup a master/slave configuration. But once you do this once, you can use a slave to create other slaves.
It depends on the database size and replication setup.
The optimal setup (opinion) would be to have a master that constantly writes to a slave (which is otherwise read-only). That way you could run a mysqldump on the slave itself and production applications can continue to write to the master without having worry about mysqldump getting in the way.
Disclaimer: I'm a MySQL hobbyist, not a DBA. Grain of salt, etc.
精彩评论