
MySQL very slow compared to MS Access when inserting hundreds of thousands of rows

I am currently adding hundreds of thousands of rows of data to a table, first in an MS Access table and then in a MySQL table.

I first tried MS Access; it took less than 40 seconds. Then I tried exactly the same source data, with the same table structure, in MySQL, and it took 6 minutes and 40 seconds. That is ten times slower!

So is it a myth that a database server has better performance?


Executing thousands of independent INSERTs is going to run very slowly. Since MySQL is a multi-user, transactional database, there is a lot more going on during each query than there is in Access. Each INSERT operation on a SQL server goes through the following steps:

  1. Decode and parse the query.
  2. Open the table for writing, establishing locks if necessary.
  3. Insert the new row.
  4. Update the indexes, if necessary.
  5. Save the table to disk.

Ideally, you want to perform steps 1, 2, 4, and 5 as few times as possible. MySQL has some features that will help you.

PREPARE your queries

By preparing a query that you are going to use repeatedly, you perform step 1 just once. Here's how:

PREPARE myinsert FROM 'INSERT INTO mytable VALUES (?, ?, ?)';
SET @id = 100;
SET @name = 'Joe';
SET @age = 34;
EXECUTE myinsert USING @id, @name, @age;
SET @id = 101;
SET @name = 'Fran';
SET @age = 23;
EXECUTE myinsert USING @id, @name, @age;
# Repeat until done
DEALLOCATE PREPARE myinsert; 

Read more about PREPARE at the mysql.com site.
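The same parse-once idea applies from client code via parameterized queries. Here is a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for a MySQL driver (the table name and rows are invented for illustration; with a MySQL driver such as MySQL Connector/Python you would use its placeholder syntax the same way):

```python
import sqlite3

# In-memory database as a stand-in for a MySQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, age INTEGER)")

# One parameterized statement, reused for every row: the SQL text is
# parsed once and cached, so only the bound values change per execution.
stmt = "INSERT INTO mytable VALUES (?, ?, ?)"
for row in [(100, "Joe", 34), (101, "Fran", 23)]:
    conn.execute(stmt, row)
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0])  # 2
```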

Use transactions

Combine several (or several hundred) INSERTs into a transaction. The server only has to do steps 2, 4, and 5 once per transaction.

PREPARE myinsert FROM 'INSERT INTO mytable VALUES (?, ?, ?)';

START TRANSACTION;
SET @id = 100;
SET @name = 'Joe';
SET @age = 34;
EXECUTE myinsert USING @id, @name, @age;
SET @id = 101;
SET @name = 'Fran';
SET @age = 23;
EXECUTE myinsert USING @id, @name, @age;
# Repeat a hundred times
COMMIT;

START TRANSACTION;
SET ...
SET ...
EXECUTE ...;
# Repeat a hundred times
COMMIT;

# Repeat transactions until done

DEALLOCATE PREPARE myinsert;

Read more about transactions.
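To see why batching matters, here is a sketch in Python with the standard-library sqlite3 module standing in for MySQL (table name and row data are invented). A thousand inserts are wrapped in a single explicit transaction, so the per-transaction bookkeeping happens once rather than a thousand times:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, age INTEGER)")

rows = [(i, "user%d" % i, 20 + i % 50) for i in range(1000)]

# One BEGIN/COMMIT pair around all 1000 inserts: the table is opened,
# locked, and flushed once per batch instead of once per row.
conn.execute("BEGIN")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
conn.execute("COMMIT")

print(conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0])  # 1000
```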

Load your table from a file

Instead of doing thousands of INSERTS, do one batch upload of your data. If your data is in a delimited file, such as a CSV, use the LOAD DATA statement.

LOAD DATA LOCAL INFILE '/full/path/to/file/mydata.csv' INTO TABLE `mytable` FIELDS TERMINATED BY ',' LINES TERMINATED BY '\r\n';

Here's a link to the MySQL page on LOAD DATA.
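If your data starts out in application code rather than a file, you can write the CSV yourself before handing it to LOAD DATA. A small Python sketch (the file path and rows are hypothetical; csv.writer emits \r\n line endings by default, which matches the LINES TERMINATED BY '\r\n' clause above):

```python
import csv
import os
import tempfile

rows = [(100, "Joe", 34), (101, "Fran", 23)]

# Hypothetical path; point LOAD DATA LOCAL INFILE at this file.
path = os.path.join(tempfile.gettempdir(), "mydata.csv")

# newline="" disables Python's own newline translation, so csv.writer's
# default "\r\n" line terminator reaches the file unchanged.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)
```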


Usually the most important performance aspect with databases is not how quickly you can insert data, but how quickly you can query it. MySQL has, I believe, a more powerful optimizer than MS Access and can make better use of indexes. An example of this is the loose index scan which can give a factor 10 or more speed up for certain types of queries.

Also, the method you use to insert data can affect how long it takes. For example, a bulk insert will typically be faster than many individual INSERT statements. Disabling indexes during the load and re-enabling them afterwards can also improve performance.
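Both ideas can be sketched in MySQL syntax (the table and values are hypothetical): the multi-row INSERT form sends one statement for many rows, and ALTER TABLE ... DISABLE KEYS defers non-unique index maintenance until the load finishes (this is effective for MyISAM tables):

```sql
ALTER TABLE mytable DISABLE KEYS;   -- defer non-unique index updates (MyISAM)

INSERT INTO mytable VALUES
    (100, 'Joe', 34),
    (101, 'Fran', 23),
    (102, 'Alice', 41);             -- one statement, many rows

ALTER TABLE mytable ENABLE KEYS;    -- rebuild the indexes in one pass
```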


Does MySQL provide any SQL trace tools so you can see what Access is sending it? From my experience with using Access with SQL Server via ODBC I can tell you that Jet makes some seemingly strange decisions with bulk inserts. What it does is send an insert for each record, instead of a batch insert for all the records. This makes it massively slower, but it does mean that it can't tie up the SQL Server with a long update (and corresponding table locks, etc.).

It's dumb from the standpoint of your insert, but smart from the standpoint of being a good client/server citizen -- it's allowing the SQL Server to decide how to serialize the requested commands and interleave them with those from other users. This means locks are shorter than they would be on a bulk insert.

With SQL Server, you can use ADO to do the trick and force it to process the insert as a batch. I don't know if there's any way to do that with MySQL.

One thing to consider:

If the source and destination tables are both in MySQL, a passthrough query should cause it to be handled entirely by MySQL.
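Such a passthrough copy might look like this (the table names are hypothetical), keeping all the data movement on the MySQL server rather than pulling rows through the client:

```sql
INSERT INTO destination_table
SELECT * FROM source_table;
```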

