Batch process runs out of memory in Spring+Hibernate+JPA
I have a batch process that persists 1,000,000 records one by one, and each record has its own child tables. I am using Spring 2.5.6, Hibernate and JPA. After about an hour the process runs out of memory. Can anybody suggest what could be wrong in my application?
Here is the code:
public void processFeeds(List<Feed> feeds) {
    for (Feed feed : feeds) {
        feed = getDAOMainFeedService().persist(feed);
        // Saving the child information
        if (feed.getID() > 0) {
            for (Address address : feed.getAddress()) {
                getDAOAddressService().persist(feed.getID(), address);
            }
            for (PersonalInfo pi : feed.getPersonalInfo()) {
                getDAOPIService().persist(feed.getID(), pi);
            }
        }
    }
}
// Service class code:
public class MainFeedServiceDAOImpl extends JpaDaoSupport implements IVehYmmRevDAO {

    public Feed persist(Feed feed) {
        try {
            getJpaTemplate().persist(feed);
            feed = getJpaTemplate().merge(feed);
            getJpaTemplate().flush();
            return feed;
        } catch (Exception exception) {
            logger.error("Persist failed", exception);
            throw new DatabaseServiceException("Persist failed", exception);
        }
    }
}
The other DAO classes have the same implementation as MainFeedServiceDAOImpl and are injected via Spring into the database service layer above. Please give some suggestions.
The reason your program is running out of memory is that every object you insert stays attached to the session (the JPA persistence context) until it is cleared. You need to clear it periodically.
I'd change the persist method in this way (note that JpaTemplate does not expose its EntityManager directly, so the flush and clear need to go through a JpaCallback so they run against the same EntityManager that did the persists):
public void persist(final List<Feed> feeds) {
    try {
        getJpaTemplate().execute(new JpaCallback() {
            public Object doInJpa(EntityManager em) {
                int count = 0;
                for (Feed feed : feeds) {
                    em.persist(feed);
                    if (++count % 10000 == 0) {
                        // Push the pending inserts to the database, then detach
                        // them so they can be garbage collected.
                        em.flush();
                        em.clear();
                    }
                }
                return null;
            }
        });
    } catch (Exception exception) {
        logger.error("Persist failed", exception);
        throw new DatabaseServiceException("Persist failed", exception);
    }
}
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/batch.html#batch-inserts
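That chapter also recommends enabling JDBC batching and disabling the second-level cache for jobs like this. As a rough sketch, assuming a standard persistence.xml (the unit name and property values here are placeholders to adapt):

<persistence-unit name="feedUnit">
    <properties>
        <!-- Group inserts into JDBC batches instead of one statement per row -->
        <property name="hibernate.jdbc.batch_size" value="50"/>
        <!-- The batch-processing chapter suggests running with the second-level cache disabled -->
        <property name="hibernate.cache.use_second_level_cache" value="false"/>
    </properties>
</persistence-unit>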
You can follow the same pattern for your address and personal-info objects, although with the right mappings and cascades in place you may not need those separate persist calls at all; see the sketch below.
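For illustration, a minimal sketch of what that cascade mapping could look like; the entity and field names are guesses based on the question, not the actual code:

import java.util.List;
import javax.persistence.*;

@Entity
public class Feed {

    @Id
    @GeneratedValue
    private long id;

    // With cascade persist on the collections, persisting the Feed also
    // persists its children, so the separate DAOAddressService/DAOPIService
    // calls become unnecessary.
    @OneToMany(mappedBy = "feed", cascade = CascadeType.PERSIST)
    private List<Address> addresses;

    @OneToMany(mappedBy = "feed", cascade = CascadeType.PERSIST)
    private List<PersonalInfo> personalInfo;
}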
Though batching the calls that persist the feeds will help, it only delays the OOME. The day your system has to handle a larger number of feeds, it will run out of memory before it even reaches the persist stage. The batching logic should also be applied at the point where the feeds enter your system, so that you control the maximum number of objects held in memory at any point in time.
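As a minimal sketch, assuming the feeds can be read incrementally from their source (FeedReader and readNextChunk are hypothetical names, not part of the code above):

private static final int CHUNK_SIZE = 10000;

public void importFeeds(FeedReader reader) {
    List<Feed> chunk;
    // Only CHUNK_SIZE feeds are ever held in memory at the same time.
    while (!(chunk = reader.readNextChunk(CHUNK_SIZE)).isEmpty()) {
        processFeeds(chunk);
    }
}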