JPA/Hibernate bulk(batch) insert
Here is simple example I've created after reading several topics about jpa bulk inserts, I have 2 persistent objects User, and Site. One user could have many site, so we have one to many relations here. Suppose I want to create user and create/link several sites to user account. Here is how code looks like, consid开发者_JS百科ering my willing to use bulk insert for Site objects.
User user = new User("John Doe");
user.getSites().add(new Site("google.com", user));
user.getSites().add(new Site("yahoo.com", user));
EntityTransaction tx = entityManager.getTransaction();
tx.begin();
entityManager.persist(user);
tx.commit();
But when I run this code (I'm using hibernate as jpa implementation provider) I see following sql output:
Hibernate: insert into User (id, name) values (null, ?)
Hibernate: call identity()
Hibernate: insert into Site (id, url, user_id) values (null, ?, ?)
Hibernate: call identity()
Hibernate: insert into Site (id, url, user_id) values (null, ?, ?)
Hibernate: call identity()
So, I means "real" bulk insert not works or I am confused?
Here is source code for this example project, this is maven project so you have only download and run mvn install to check output.
UPDATED:
After Ken Liu kindly advise, I've disabled Site object id auto generation:
User user = new User("John Doe");
user.getSites().add(new Site(1, "google.com", user));
user.getSites().add(new Site(2, "yahoo.com", user));
entityManager.setFlushMode(FlushModeType.COMMIT);
EntityTransaction tx = entityManager.getTransaction();
tx.begin();
entityManager.persist(user);
tx.commit();
Now I have following line in debug output:
DEBUG: org.hibernate.jdbc.AbstractBatcher - Executing batch size: 2
It works!
If you're using the database to generate ids, then Hibernate has to execute a query to generate the primary key for each entity.
I have found it much more efficient to bypass hibernate for bulk inserts. You must do away with ORM (object relational mapping) but you can still leverage the connection associated with the current session and the transaction management.
While you temporarily lose the convenience of your ORM, the payoff is significant, especially if you have natively generated Ids since hibernate would normally perform one SELECT
for each INSERT
.
Session.doWork
is very handy for facilitating this.
private MyParentObject saveMyParentObject(final MyParentObject parent, final List<MyChildObject> children)
{
transaction = session.beginTransaction();
try
{
session.save(parent); // NOTE: parent.parentId assigned and returned here
session.doWork(new Work()
{
public void execute(Connection con) throws SQLException
{
// hand written insert SQL - can't use hibernate
PreparedStatement st = con.prepareStatement("INSERT INTO my_child (parent_id, name, ...) values (?, ?, ...)");
for (MyChildObject child : children)
{
MyChildObject child = new MyChildObject();
child.setParentId(parent.getParentId()); // assign parent id for foreign key
// hibernate can't help, determine jdbc parameters manually
st.setLong(1, child.getParentId());
st.setString(2, child.getName());
...
st.addBatch();
}
// NOTE: you may want to limit the size of the batch
st.executeBatch();
}
});
// if your parent has a OneToMany relationship with child(s), refresh will populate this
session.refresh(parent);
transaction.commit();
return parent;
}
catch(Throwable e)
{
transaction.rollback();
throw new RuntimeException(e);
}
}
I have written a short blog which talks about batch insert gotchas and also has pointer to small project that has all the right configurations to get started with batch insert with Hibernate. See the details at http://sensiblerationalization.blogspot.com/2011/03/quick-tip-on-hibernate-batch-operation.html
精彩评论