Improve performance when importing data to MySQL?
I'm using Django to build a website with a MySQL (MyISAM) backend.
The database data is imported from a number of XML files that a开发者_C百科n external script process and output as a JSON-file. Whenever a new JSON file differ from the old one, I need to wipe the old MySQL-db and recreate it using manage.py loaddata, (at least that's the easy way to do it, I guess I could check the differences between the JSON files and apply those to the database, but I haven't figured out a good solution for this (I'm neither a very good coder nor a web developer)).
Anyway, the JSON file is around 10 Mb, and ends up being about 21,000 rows of SQL (It's not expected to grow significantly). There are 7 tables, and they all look something like this:
class Subnetwork(models.Model):
SubNetwork = models.CharField(max_length=50)
NetworkElement = models.CharField(max_length=50)
subNetworkId = models.IntegerField()
longName = models.CharField(max_length=50)
shortName = models.CharField(max_length=50)
suffix = models.CharField(max_length=50)
It takes up to a minute (sometimes only 30 seconds) to import it into MySQL. I don't know if this is to be expected from a file of this size? What can I do (if anything) to improve perfomance?
For what it's worth, here's some profiler output https://gist.github.com/1287847
There are a couple of solutions, same decent than others, but here is a workaround to keep your system's "downtime" minimal, without needing to write a db synchronize mechanism (which would probably be a better solution in most times).:
- Create a custom
settings_build.py
file, withfrom settings import *
that chooses a random name for a new db (probably with the date in the db name), creates it by calling mysqladmin, and update the name intoDATABASES
. - Create a custom django management command (let's call it
builddb
) by either cloning theloaddata
command or calling it, and on successful result, it should write the db name to adbname
text file with one line and executes a shell command that reloads your django (apache/gunicorn/?) server. - Modify your
settings.py
to load the database name from the text file.
And now run your build process like this:
./manage.py builddb --settings=settings_build
I solved it by exporting the processed XML-files to csv instead of json, and then used a separate script that called mysqlimport to do the importing.
精彩评论