Hibernate / MySQL Bulk insert problem
I'm having trouble getting Hibernate to perform a bulk insert on MySQL.
I'm using Hibernate 3.3 and MySQL 5.1
At a high level, this is what's happening:
@Transactional
public Set<Long> doUpdate(Project project, IRepository externalSource) {
List<IEntity> entities = externalSource.loadEntites();
buildEntities(entities, project);
persis开发者_运维百科tEntities(project);
}
public void persistEntities(Project project) {
projectDAO.update(project);
}
This results in n log entries (1 for every row) as follows:
Hibernate: insert into ProjectEntity (name, parent_id, path, project_id, state, type) values (?, ?, ?, ?, ?, ?)
I'd like to see this get batched, so the update is more performant. It's possible that this routine could result in tens-of-thousands of rows generated, and a db trip per row is a killer.
Why isn't this getting batched? (It's my understanding that batch inserts are supposed to be default where appropriate by hibernate).
Pascal's answer is correct. However, because you are using MySQL, I also highly recommend that you try using the rewriteBatchedStatements=true
parameter in your JDBC URL.
This parameter causes the JDBC driver to dynamically re-write your INSERT batches to use a single "multi-valued" INSERT, e.g.:
INSERT INTO mytable (mycol) VALUES (0);
INSERT INTO mytable (mycol) VALUES (1);
INSERT INTO mytable (mycol) VALUES (2);
will be re-written to:
INSERT INTO mytable (mycol) VALUES (0), VALUES (1), VALUES (2);
This can make a significant difference in some cases. See http://www.jroller.com/mmatthews/entry/speeding_up_batch_inserts_for for some example measurements.
As documented in the Chapter 13. Batch processing:
If you are undertaking batch processing you will need to enable the use of JDBC batching. This is absolutely essential if you want to achieve optimal performance. Set the JDBC batch size to a reasonable number (10-50, for example):
hibernate.jdbc.batch_size 20
Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.
Don't forget to flush
and then clear
the session regularly or you'll get OutOfMemoryException
as documented in 13.1. Batch inserts.
But IMO, for tens-of-thousands of rows, you should consider using the StatelessSession
interface.
Pascal has pretty much nailed it in hibernate context. As an alternative, you can use jbdc template's Batchsqlupdate. However I have to warn you that hibernate cached instances may not reflect the changes done using above. In our project, we had to take precautionary measures to overcome this, by creating a different schedule (another problem created, but within our control)
精彩评论