Hibernate server-side, or in each instance of a Swing application?
I would like to know the best design for an existing heavy Swing application. This application needs database access (new), and I am using Hibernate for that.
This application runs on many computers and needs a few things:
- the ability to 'lock' a record for modification (another instance will not be able to edit that record), but I have to find a way to keep a record from staying locked because of a crash or similar failure;
- the ability to be notified of database update events (I am using PostgreSQL, maybe that can help).
So my question: where should I instantiate Hibernate? In each application instance? Or in a unique instance 'server side', with a protocol mechanism I would have to code (RMI? distributed EHCache? ...)
The problem with the server model is that I would have to code mechanisms to detect application shutdown or crashes.
Thanks.
To answer your overall question, we first need to explore the question of how heavy your database interaction is.
In general, in a Java EE context, the recommended course of action is to keep all your database interaction on the server and design a useful, usable remote interface which you provide access to via EJB. However, this course of action assumes that it is less expensive to transmit a high-level objective from the client to the server and do all the processing there, in other words, a service API rather than a data API.
It's often useful to examine the recommended patterns of use in Java EE, because there's often a hidden hint in there. Using EJBs, for example, allows you to treat the application server as the gateway to all modification of your data. This gives you a great deal more control and lets you offload a lot of the work you would ordinarily have to do yourself, such as managing transactions, onto the container. It also alleviates the need for fancy locking strategies—for the most part.
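To make the "service API rather than data API" idea concrete, here is a minimal sketch. All names are hypothetical; in a real Java EE deployment the interface would be a @Remote EJB and the implementation a @Stateless bean, with the container managing transactions. The in-memory implementation below only stands in for the server side so the shape of the API is visible:

```java
import java.util.HashMap;
import java.util.Map;

// A service API: the client asks the server to perform a high-level
// operation instead of fetching rows and updating them itself.
interface RecordService {
    void renameRecord(long recordId, String newName);
    String recordName(long recordId);
}

// Server-side implementation; because clients can only go through this
// gateway, they never need database credentials or row locks.
class InMemoryRecordService implements RecordService {
    private final Map<Long, String> names = new HashMap<>();

    InMemoryRecordService() {
        names.put(1L, "original");
    }

    @Override
    public synchronized void renameRecord(long recordId, String newName) {
        if (!names.containsKey(recordId)) {
            throw new IllegalArgumentException("no such record: " + recordId);
        }
        names.put(recordId, newName);
    }

    @Override
    public synchronized String recordName(long recordId) {
        return names.get(recordId);
    }
}
```

The Swing client would hold a stub to this interface and transmit high-level objectives ("rename record 1") rather than data.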
From a concurrency perspective, there are pros and cons to each approach.
Server-side Pros
- No need to worry about cache synchronization with clients
- Simplified transaction handling
- Lots of existing infrastructure (EJB)
- No direct access to the database from a malicious client (no need for clients to have database passwords, for example)
Client-side Pros
- No detached objects or data-transfer object woes
- Simplified programming model
- No need for a fancy Java EE application server
PostgreSQL has support for explicit row locks: issuing a SELECT ... FOR UPDATE against the rows you want to lock prevents other processes from issuing UPDATEs or DELETEs against those rows. However, this is a definite code smell. PostgreSQL has extremely powerful algorithms for handling concurrency. In general, I find that people who want to handle locks in the database explicitly usually do not understand how databases handle concurrency, and it almost always leads to pain. So some questions you should ask yourself before proceeding with an application design that requires explicit locks: why am I locking the data? Why am I worried about other processes accessing these rows at the same time? Is it really bad for other processes to update these rows at the same time? Often I find that this is an indication that there should be another entity, and that rows should be owned by particular users instead.
Another important point is that locks in PostgreSQL only live as long as the transaction is open. Long-running transactions are a very bad idea, because they increase the possibility of deadlock. If PostgreSQL detects a deadlock, it will pick one of the connections and kill it, and the odds are very low that it will kill the one you want. Hibernate also abhors long-running transactions. If you are tempted to move database access into the Swing client, odds are very good you are planning on allowing extremely long-running transactions, either explicitly with PostgreSQL directly or implicitly by holding Hibernate Sessions or EntityManagers open for long periods of time. Both are really bad ideas that introduce a lot of memory overhead and room for concurrency problems.
Additionally, I would be outright shocked if Hibernate cooperated nicely with manually acquired locks. Once Hibernate is involved with your database, I would consider it a very bad idea to expect to retain any amount of control over aspects of the database having to do with transactions. Hibernate really expects to own the process and is, in a sense, hijacking it for its own purposes; mucking around with the same connection directly will lead to pain sooner or later.
There are two pieces of good news to be salvaged from all of this bad news:
- PostgreSQL has a neat notification facility that can be used to synchronize between different processes. I have no idea how one would go about using this with Hibernate though.
- PostgreSQL will not retain a lock longer than a transaction lasts. If your connection acquires a lock and then disconnects, the lock is forfeited. You don't need to worry about that. (Even if this weren't the case, PostgreSQL will notice deadlock and murder one of the transactions to allow work to proceed).
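The notification facility mentioned above is LISTEN/NOTIFY. A rough sketch with plain JDBC follows (the channel name is hypothetical). Issuing LISTEN and NOTIFY works through an ordinary Statement; actually retrieving delivered notifications requires the PostgreSQL driver's own API (org.postgresql.PGConnection.getNotifications()), which is only hinted at in a comment here to keep the sketch free of the driver dependency:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class PgNotifySketch {
    // Channel names are SQL identifiers, not strings; a conservative check
    // before splicing one into a statement.
    static boolean validChannel(String channel) {
        return channel.matches("[a-zA-Z_][a-zA-Z0-9_]*");
    }

    // Subscribe this connection to a channel.
    static void listen(Connection conn, String channel) throws SQLException {
        if (!validChannel(channel)) throw new IllegalArgumentException(channel);
        try (Statement st = conn.createStatement()) {
            st.execute("LISTEN " + channel);
        }
        // To actually receive notifications, unwrap the driver class:
        //   org.postgresql.PGConnection pg = conn.unwrap(org.postgresql.PGConnection.class);
        //   org.postgresql.PGNotification[] pending = pg.getNotifications();
        // and poll pending periodically, e.g. from a Swing timer.
    }

    // Fire a notification, e.g. after a successful update (or from a trigger).
    static void notify(Connection conn, String channel, String payload) throws SQLException {
        if (!validChannel(channel)) throw new IllegalArgumentException(channel);
        try (Statement st = conn.createStatement()) {
            st.execute("NOTIFY " + channel + ", '" + payload.replace("'", "''") + "'");
        }
    }
}
```

Each Swing instance could LISTEN on a channel such as record_updates and refresh its view when a notification arrives; this does not integrate with Hibernate's session cache, so after a notification you would still need to evict or reload the affected entities.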
In the event you decide to allow distributed clients to access the database directly with long-running transactions, you may want to investigate Terracotta for distributed second-level caching. It is, effectively, a way for a distributed set of processes to share a cache.
In Conclusion
It sounds from the way the question is worded as though you want a lot of explicit control over the database while also using Hibernate. This indicates to me that there is a confusion of needs. Hibernate is a database abstraction layer; it is about hiding the kinds of details you are actively interested in. Moreover, PostgreSQL itself is also an abstraction, particularly over concurrency concerns. I think you would probably find it easier to implement the design you seem to be requiring over a flat file with a custom server to support it, which is a good indication that either your needs are very specialized and poorly matched to the candidate technologies, or you misunderstand the candidate technologies. I see a lot more misunderstanding than special needs (though it happens).
Relational databases are not just a persistence mechanism: they are also an integration point that manages data integrity. ACID compliance implies that the database will itself ensure that data is not left in a corrupted state by multiple processes writing the same things at the same time. Moreover, because multiple processes can be changing the data, you must write your software aware that the data it has just fetched from the database is already potentially out-of-date. This is why Hibernate performs all of its SELECTs inside a transaction: it's the only way to get a coherent snapshot of a system that is currently undergoing change.
Writing your application with explicit locking implies that you and your application think of the database as being merely some kind of storage which you can use as a synchronization point, by preventing or allowing other processes to make changes. I assure you that locks in a relational database context are the enemy of both performance and frequently of data integrity as well. Additionally, they usually don't work as you expect or need; for example, even PostgreSQL's highest row lock level will not prevent another process from examining the current value the row has, and your changes to that value will not take effect until the transaction is committed. This is very different from the way locks work in a traditional multi-threaded process with ordinary variables and ordinary locks. Usually if you think you need locks in a relational database, you either don't, or you need to rethink your application's design so as not to need them, or perhaps you simply need to rethink your database design to eliminate the need.
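The usual rethink is to replace pessimistic locks with optimistic concurrency, which Hibernate supports out of the box via a @Version-mapped column: instead of waiting on a lock, a stale write simply fails and the application retries or informs the user. Below is a minimal pure-Java sketch of the check that mechanism performs (the class and names are illustrative, not Hibernate internals):

```java
// Sketch of the version check behind optimistic locking: an update only
// succeeds if the row's version is unchanged since it was read.
class VersionedRecord {
    private String name;
    private long version; // what a @Version field maintains in a mapped entity

    VersionedRecord(String name) {
        this.name = name;
        this.version = 0;
    }

    long version() { return version; }
    String name() { return name; }

    // Corresponds to "UPDATE ... SET name = ?, version = version + 1
    // WHERE id = ? AND version = ?" -- zero rows updated means stale data.
    synchronized void update(String newName, long expectedVersion) {
        if (version != expectedVersion) {
            // Hibernate raises an OptimisticLockException in this case.
            throw new IllegalStateException("stale update: row changed since it was read");
        }
        name = newName;
        version++;
    }
}
```

No lock is held while the user edits; the record can sit open in a Swing dialog for an hour without blocking anyone, and the conflict is detected only at write time.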
In any case, the scenario you describe is a very poor fit for distributed computing of any kind, and an especially poor fit for Hibernate.