开发者

Split up a mysqldump file with multiple databases, by database

I have a mysqldump file of multiple databases (5). One of the database takes a very long time to load, is there a way to either split the m开发者_开发知识库ysqldump file by database, or just tell mysql to load only one of the specified databases?

Manish


This Perl script should do the trick.

#!/usr/bin/perl -w
#
# splitmysqldump - split mysqldump file into per-database dump files.

use strict;
use warnings;

my $dbfile;
my $dbname = q{};
my $header = q{};

while (<>) {

    # Beginning of a new database section:
    # close currently open file and start a new one
    if (m/-- Current Database\: \`([-\w]+)\`/) {
    if (defined $dbfile && tell $dbfile != -1) {
        close $dbfile or die "Could not close file!"
    } 
    $dbname = $1;
    open $dbfile, ">>", "$1_dump.sql" or die "Could not create file!";
    print $dbfile $header;
    print "Writing file $1_dump.sql ...\n";
    }

    if (defined $dbfile && tell $dbfile != -1) {
    print $dbfile $_;
    }

    # Catch dump file header in the beginning
    # to be printed to each separate dump file.  
    if (! $dbname) { $header .= $_; }
}
close $dbfile or die "Could not close file!"

Run this for the dump file containing all databases

./splitmysqldump < all_databases.sql


Or, it is possible to save every database into separate file directly...

#!/bin/bash
dblist=`mysql -u root -e "show databases" | sed -n '2,$ p'`
for db in $dblist; do
    mysqldump -u root $db | gzip --best > $db.sql.gz
done


Here is a great blog post I always re-refer to to do this kind of thing with a mysqldump.

http://gtowey.blogspot.com/2009/11/restore-single-table-from-mysqldump.html

You can easily extend it to extract individual db's.


I've been working on a python script which splits one big dump file into small ones, one per database. It's name is dumpsplit and here's a scratch:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import re
import os

HEADER_END_MARK = '-- CHANGE MASTER TO MASTER_LOG_FILE'
FOOTER_BEGIN_MARK = '\/\*\!40103 SET TIME_ZONE=@OLD_TIME_ZONE \*\/;'
DB_BEGIN_MARK = '-- Current Database:'

class Main():
    """Whole program as a class"""

    def __init__(self,file,output_path):
        """Tries to open mysql dump file to call processment method"""
        self.output_path = output_path
        try:
            self.file_rsrc = open(file,'r')
        except IOError:
            sys.stderr.write('Can\'t open %s '+file)
        else:
            self.__extract_footer()
            self.__extract_header()
            self.__process()

    def __extract_footer(self):
        matched = False
        self.footer = ''
        self.file_rsrc.seek(0)
        line = self.file_rsrc.next()
        try:
            while line:
                if not matched:
                    if re.match(FOOTER_BEGIN_MARK,line):
                        matched = True
                        self.footer = self.footer + line
                else:
                    self.footer = self.footer + line
                line = self.file_rsrc.next()
        except StopIteration:
            pass
        self.file_rsrc.seek(0)

    def __extract_header(self):
        matched = False
        self.header = ''
        self.file_rsrc.seek(0)
        line = self.file_rsrc.next()
        try:
            while not matched:
                self.header = self.header + line
                if re.match(HEADER_END_MARK,line):
                    matched = True
                else:
                    line = self.file_rsrc.next()
        except StopIteration:
            pass
        self.header_end_pos = self.file_rsrc.tell()
        self.file_rsrc.seek(0)

    def __process(self):
        first = False
        self.file_rsrc.seek(self.header_end_pos)
        prev_line = '--\n'
        line = self.file_rsrc.next()
        end = False
        try:
            while line and not end:
                    if re.match(DB_BEGIN_MARK,line) or re.match(FOOTER_BEGIN_MARK,line):
                    if not first:
                        first = True
                    else:
                        out_file.writelines(self.footer)
                        out_file.close()
                    if not re.match(FOOTER_BEGIN_MARK,line):
                        name = line.replace('`','').split()[-1]+'.sql'
                        print name
                        out_file = open(os.path.join(self.output_path,name),'w')
                        out_file.writelines(self.header + prev_line + line)
                        prev_line = line
                        line = self.file_rsrc.next()
                    else:
                        end = True
                else:
                    if first:
                        out_file.write(line)
                    prev_line = line
                    line = self.file_rsrc.next()
        except StopIteration:
            pass

if __name__ == '__main__':
    Main(sys.argv[1],sys.argv[2])


Like Stano suggested the best thing would be to do it at dump time with something like...

mysql -Ne "show databases" | grep -v schema | while read db; do mysqldump $db | gzip > $db.sql.gz; done

Of course, this relies on the presence of a ~/.my.cnf file with

[client]
user=root
password=rootpass

Otherwise just define them with the -u and -p parameters to the mysql and mysqldump call:

mysql -u root -prootpass -Ne "show databases" | grep -v schema | while read db; do mysqldump -u root -prootpass $db | gzip > $db.sql.gz; done

Hope this helps


A "mysqldump file" is just a text file full of SQL statements. As such, you can use any variety of text editors to chop it up how you see fit.

You might be better served by doing a more selective dump in the first place (just one database per file, etc.). If you don't have access to the original database, you could also do a full restore, then use mysqldump again create dumps for the individual databases.

If you just want a quick and dirty solution, a quick google search yields references to a couple tools that might also be useful.


I might do the dump and reload in steps:

  1. Take the dump of table structure with --no-data with dumps per database.
  2. Create the structure in new server
  3. Take the data dump of table with --no-create-info per database level
  4. Now, as have dumps per database, I can split the files even with cut file if some particular file is large.

Note: if you are using MyISAM tables, you can disable the indexes evaluation during step 4 and re-enable it later to make your insert faster.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜