JPA persistence using multiple threads
I have a problem when I try to persist objects using multiple threads.
Details:
Suppose I have an object PaymentOrder which has a list of PaymentGroup (one-to-many relationship), and PaymentGroup contains a list of CreditTransfer (one-to-many relationship again).
Since the number of CreditTransfer records is huge (in lakhs, i.e. hundreds of thousands), I have grouped them by PaymentGroup (based on some business logic) and I create worker threads (one thread per PaymentGroup) to build the PaymentOrder objects and commit them to the database.
The problem is that each worker thread creates its own PaymentOrder (which contains a unique set of PaymentGroups).
The primary keys for all the entities are auto-generated.
So there are three tables, 1. PAYMENT_ORDER_MASTER, 2. PAYMENT_GROUPS, 3. CREDIT_TRANSFERS, all mapped by one-to-many relationships.
Because of that, when the second thread tries to persist its group in the database, the framework tries to persist the same PaymentOrder which the previous thread already committed, and the transaction fails due to other unique-field constraints (the checksum of the PaymentOrder).
Ideally it must be 1..n..m (PaymentOrder -> PaymentGroup -> CreditTransfer).
What I need to achieve is: if there is no entry for the PaymentOrder in the database, make one; if it is already there, don't make an entry in PAYMENT_ORDER_MASTER, only in PAYMENT_GROUPS and CREDIT_TRANSFERS.
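The find-or-create behavior I want could be sketched like this. This is an in-memory analogue only (class and method names are hypothetical, and `ConcurrentHashMap` stands in for PAYMENT_ORDER_MASTER); the real version would query by checksum inside a transaction before deciding whether to insert the parent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class FindOrCreateSketch {
    // In-memory stand-in for PAYMENT_ORDER_MASTER, keyed by the order checksum.
    static final ConcurrentMap<String, String> orderMaster = new ConcurrentHashMap<>();

    // Returns the existing order for this checksum, or creates it exactly once,
    // even when many worker threads call this concurrently.
    static String findOrCreateOrder(String checksum) {
        return orderMaster.computeIfAbsent(checksum, cs -> "order-for-" + cs);
    }

    public static void main(String[] args) {
        String first = findOrCreateOrder("abc123");
        String second = findOrCreateOrder("abc123"); // same order, no duplicate row
        System.out.println(first.equals(second));    // true
        System.out.println(orderMaster.size());      // 1
    }
}
```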
How can I overcome this problem, while maintaining the split-master-payment-order-using-groups logic and multiple threads?
You've got options.
1) Primitive but simple: catch the key-violation error and retry your insert without the parent. Assuming your parents are truly unique, you know that another thread just inserted the parent, so proceed with the children. This may perform poorly compared to other options, but maybe you get the pop you need. If you had a high percentage of parents with one child, it would work nicely.
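Option 1 could be sketched roughly as below. This simulates the unique constraint with an in-memory set (names are hypothetical); in real code you would catch your provider's constraint-violation exception and retry in a fresh transaction.

```java
import java.util.HashSet;
import java.util.Set;

public class RetryWithoutParentSketch {
    // Stand-in for PAYMENT_ORDER_MASTER with a unique constraint on checksum.
    static final Set<String> parentTable = new HashSet<>();

    static synchronized void insertParent(String checksum) {
        if (!parentTable.add(checksum)) {
            throw new IllegalStateException("unique constraint violated: " + checksum);
        }
    }

    // Try parent + children; on a key violation, retry with children only.
    static String persistGroup(String orderChecksum, String groupId) {
        try {
            insertParent(orderChecksum);
            return "inserted parent and children for " + groupId;
        } catch (IllegalStateException duplicateKey) {
            // Another thread already did the parent; proceed with children only.
            return "inserted children only for " + groupId;
        }
    }

    public static void main(String[] args) {
        System.out.println(persistGroup("abc123", "g1")); // inserted parent and children for g1
        System.out.println(persistGroup("abc123", "g2")); // inserted children only for g2
    }
}
```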
2) Change your read-consistency level. It's vendor-specific, but you can sometimes read uncommitted transactions, which would let you see the other threads' work prior to commit. It isn't foolproof; you still have to do #1 as well, since another thread can sneak in after the read. But it might improve your throughput, at the cost of more complexity. It could also be impossible, depending on the RDBMS (or it may only be settable at the DB level, messing up other apps!).
3) Implement a work queue with a single-threaded consumer. If the main expensive work of the program happens before the persistence layer, you can have your threads "insert" their data into a work queue, where the keys aren't enforced, and have a single thread pull from the work queue and persist. The work queue can be in memory, in another table, or in a vendor-specific place (WebLogic queue, Oracle AQ, etc.). You parallelize the expensive work and go back to a single thread for the inserts. You can even have your consumer work in "batch insert" mode. Sweeeeeeeet.
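The in-memory flavor of option 3 might look like this sketch using `BlockingQueue` (the group names and class are made up): parallel producers do the pre-persistence work and enqueue; one consumer drains everything and batch-persists, so no two transactions ever race on the same parent row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class WorkQueueSketch {
    // Producers enqueue finished groups; a single consumer drains and persists.
    static List<String> collectBatch(List<String> groups) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Parallel producers: do the expensive pre-persistence work, then enqueue.
        List<Thread> producers = new ArrayList<>();
        for (String g : groups) {
            Thread t = new Thread(() -> queue.add(g));
            producers.add(t);
            t.start();
        }
        for (Thread t : producers) t.join();

        // Single-threaded consumer: drain everything and batch-insert it.
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> batch = collectBatch(List.of("group-1", "group-2", "group-3"));
        System.out.println("batch insert of " + batch.size() + " groups"); // batch insert of 3 groups
    }
}
```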
4) Relax your constraints. Who cares really if there are two parents for the same child holding identical information? I'm just asking. If you don't later need super fast updates on the parent info, and you can change your reading programs to understand it, it can work nicely. It won't get you an "A" in DB design class, but if it works.....
5) Implement a goofy lock table. I hate this solution, but it does work: have your thread write down that it is working on parent "x" as its first transaction (and commit), so nobody else can. It typically leads to the same problem (and others: cleaning up the records later, etc.), but it can work when child inserts are slow and a single-row insert is fast. You'll still have collisions, but fewer.
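The claim step of option 5 could be sketched like this, with `putIfAbsent` standing in for the lock table's unique-key insert (everything here is hypothetical; the real version would be an INSERT into a lock table committed in its own transaction).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LockTableSketch {
    // Stand-in for a lock table: the first thread to record parent "x" owns it.
    static final ConcurrentMap<String, String> lockTable = new ConcurrentHashMap<>();

    // Returns true if this worker won the right to insert the parent.
    static boolean claimParent(String parentKey, String workerId) {
        return lockTable.putIfAbsent(parentKey, workerId) == null;
    }

    public static void main(String[] args) {
        System.out.println(claimParent("x", "worker-1")); // true: worker-1 inserts the parent
        System.out.println(claimParent("x", "worker-2")); // false: worker-2 inserts children only
    }
}
```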
Hibernate sessions are not thread-safe, and the JDBC connections that underlie Hibernate are not thread-safe either. Consider multithreading your business logic instead, so that each thread uses its own Hibernate session and JDBC connection. By using a thread pool you can further improve your code by adding the ability to throttle the number of simultaneous threads.
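A rough sketch of that shape, assuming one session per task (the class and group names here are invented, and the session work is reduced to a counter):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPoolSketch {
    // Process each group on a pool thread; the pool size throttles concurrency.
    static int processGroups(List<String> groups) throws InterruptedException {
        AtomicInteger processed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (String group : groups) {
            pool.submit(() -> {
                // In real code each task would open its OWN EntityManager/Session
                // here (never shared across threads), work, commit, and close it.
                processed.incrementAndGet();
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("processed " + processGroups(List.of("g1", "g2", "g3", "g4", "g5")) + " groups");
    }
}
```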