What are the requirements for a web application to work in a clustered environment?
I need to check whether an existing web application is ready to be deployed in a clustered environment.
Cluster: several Linux boxes. Traffic is distributed by a load balancer that uses a simple round-robin algorithm with sticky sessions.
Application: a (hopefully) stateless Java web application that retrieves content from the back office and formats it appropriately.
I have access to the source code. What should I check in the code to be sure that it will run in the cluster?
- Check that nothing that holds application state is cached in memory or on the file system.
- ...Something else?
If you're using EJBs (which is recommended if you access a DB), then here is a list of restrictions:
http://java.sun.com/blueprints/qanda/ejb_tier/restrictions.html
I guess similar restrictions apply to the web application.
The easiest way to check the application is to start by running it on two servers with the same data, so that at startup both are in the same state. Let's assume that, for a user to complete an operation, the browser makes two consecutive HTTP requests to your web app. Send the first request to web server 1 and the second to web server 2; then try it the other way around; then send both requests to the same web server. If you get the same result each time, you very likely have a ready-to-cluster application. (This doesn't prove the app IS ready to cluster, as it might keep object state in memory that is not easy to spot from the front end, but it gives you a higher probability that it MIGHT BE OK to run in a cluster.)
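The routing test described above can be sketched in Java. This is only a minimal illustration under stated assumptions: each server is modelled as an in-memory `Function<String, String>` instead of a real HTTP endpoint, and `ClusterRoutingCheck`, `statelessNode` and `statefulNode` are hypothetical names invented for the example. In a real test you would send the requests straight to each server, bypassing the load balancer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Replays one request sequence under every possible request-to-node routing. */
class ClusterRoutingCheck {

    /**
     * Returns true if the responses are identical no matter which node serves
     * each request. The supplier must return freshly started nodes so that
     * every run begins from the same state.
     */
    static boolean sameResultOnEveryRouting(
            Supplier<List<Function<String, String>>> freshCluster,
            List<String> requests) {
        int nodeCount = freshCluster.get().size();
        int routings = 1;
        for (int i = 0; i < requests.size(); i++) {
            routings *= nodeCount; // nodeCount ^ requests.size()
        }
        List<String> baseline = null;
        for (int routing = 0; routing < routings; routing++) {
            List<Function<String, String>> nodes = freshCluster.get();
            List<String> responses = new ArrayList<>();
            int code = routing; // digits in base nodeCount pick the node per request
            for (String request : requests) {
                responses.add(nodes.get(code % nodeCount).apply(request));
                code /= nodeCount;
            }
            if (baseline == null) {
                baseline = responses;
            } else if (!baseline.equals(responses)) {
                return false; // some routing produced a different result
            }
        }
        return true;
    }

    /** Hypothetical node that keeps nothing between requests. */
    static Function<String, String> statelessNode() {
        return request -> "rendered:" + request;
    }

    /** Hypothetical node that remembers a login in its own memory only. */
    static Function<String, String> statefulNode() {
        boolean[] loggedIn = {false};
        return request -> {
            if (request.equals("login")) {
                loggedIn[0] = true;
                return "welcome";
            }
            return loggedIn[0] ? "profile page" : "please log in";
        };
    }

    public static void main(String[] args) {
        List<String> requests = List.of("login", "profile");
        System.out.println(sameResultOnEveryRouting(
                () -> List.of(statelessNode(), statelessNode()), requests));
        System.out.println(sameResultOnEveryRouting(
                () -> List.of(statefulNode(), statefulNode()), requests));
    }
}
```

As the answer says, a passing run only raises the probability that the application is cluster-ready; state that never shows up in a response will not be caught this way.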
If it's truly "stateless", there is no problem: you can make any request to any server at any time and everything just works. Most things aren't quite that easy, so any state either has to be passed back and forth between client and server with each request and response, or be stored on the back end in a shared data store, with some sort of token passed back and forth to retrieve it. If the application uses the HttpSession, then anything that is retrieved from the session and then modified needs to be set back into the session with session.setAttribute(key, value). Setting the attribute acts as the signal that whatever is stored in the session needs to be replicated to the redundant servers. Make sure anything stored in the session implements, and actually is, Serializable. Some servers (I'm looking at you, WebLogic) will let you store objects that are not, but will then throw an exception when they try to replicate the object. I've had many a coworker complain that having to set things back into the session should be redundant, and perhaps it should be, but this is just the way things work.
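One way to check the "actually is Serializable" part is to try serializing a sample of every type you put into the session. The sketch below uses only standard Java serialization. `SessionAttributeCheck`, `GoodCart` and `BrokenCart` are hypothetical names for the example, and it does not touch the servlet API or show how any particular server replicates sessions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class SessionAttributeCheck {

    /** Returns true only if the whole object graph can really be serialized. */
    static boolean isActuallySerializable(Object value) {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (IOException e) { // includes NotSerializableException
            return false;
        }
    }

    /** Hypothetical session attribute whose fields are all serializable. */
    static class GoodCart implements Serializable {
        private static final long serialVersionUID = 1L;
        final ArrayList<String> items = new ArrayList<>();
    }

    /** Declares Serializable, but one field is a plain Object, which is not. */
    static class BrokenCart implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object lock = new Object();
    }

    public static void main(String[] args) {
        System.out.println(isActuallySerializable(new GoodCart()));
        System.out.println(isActuallySerializable(new BrokenCart()));
    }
}
```

A class like `BrokenCart` compiles and can be stored in a local session without complaint; the failure only appears when something tries to write it out, which is why a check of this kind is worth running before deployment.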
Having state is not a big problem if it is handled properly. Anyway, all applications have state: even when serving a more or less static file, the file content associated with a URL is part of the state.
The problem is how this state is propagated and shared.
- State inside the user session is a no-brainer. Use a session replication mechanism (slower, but no session loss when a node crashes) or a load balancer with sticky sessions, and your problem is solved.
All other shared state is indeed a problem. In particular, even cached state must be shared and perfectly coherent; otherwise a refresh of the same page could produce different results at random, depending on which web server, and thus which cache, you hit.
You can still cache data by using a shared cache (such as Ehcache) or by falling back to sticky sessions.
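The difference between per-node caches and a shared cache can be illustrated with a small simulation. This is a sketch under assumptions: `CacheCoherenceSketch` and `WebNode` are hypothetical names, the back office is a plain map, and the "shared cache" is just one in-process map handed to both nodes. It stands in for a distributed cache and is not Ehcache's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CacheCoherenceSketch {

    /** Hypothetical web node that caches content fetched from the back office. */
    static class WebNode {
        private final Map<String, String> backOffice;
        private final Map<String, String> cache;

        WebNode(Map<String, String> backOffice, Map<String, String> cache) {
            this.backOffice = backOffice;
            this.cache = cache;
        }

        String render(String key) {
            // Fetch from the back office only on a cache miss.
            return cache.computeIfAbsent(key, backOffice::get);
        }
    }

    /** Returns true if both nodes serve the same page after a content update. */
    static boolean nodesAgreeAfterUpdate(boolean sharedCache) {
        Map<String, String> backOffice = new ConcurrentHashMap<>();
        backOffice.put("home", "v1");

        Map<String, String> cacheA = new ConcurrentHashMap<>();
        Map<String, String> cacheB =
                sharedCache ? cacheA : new ConcurrentHashMap<String, String>();
        WebNode nodeA = new WebNode(backOffice, cacheA);
        WebNode nodeB = new WebNode(backOffice, cacheB);

        nodeA.render("home");         // node A caches "v1"
        backOffice.put("home", "v2"); // content changes in the back office

        // With separate caches node B now fetches "v2" while node A still has "v1".
        return nodeA.render("home").equals(nodeB.render("home"));
    }

    public static void main(String[] args) {
        System.out.println(nodesAgreeAfterUpdate(false));
        System.out.println(nodesAgreeAfterUpdate(true));
    }
}
```

Note that in the shared case both nodes agree on the stale value: sharing the cache makes the nodes coherent with each other, while expiry or invalidation is a separate concern.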
I guess it is pretty difficult to be sure that the application will indeed work in a clustered environment, because a singleton in some obscure service, a static member somewhere, or anything similar can produce strange results. You can certainly validate the general architecture, but you'll need to try it for real and run validation tests before going into production.
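When reviewing the source for singletons and static members, a rough reflection scan can help build a list of places to look at. The sketch below is a heuristic only and rests on assumptions: `StaticStateScan` and `PriceService` are hypothetical names, and the rule (flag static fields that are non-final, or that are a Map or Collection) will both miss some state and flag harmless fields.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class StaticStateScan {

    /** Lists static fields that look like per-JVM mutable state, sorted by name. */
    static List<String> suspiciousStaticFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            int modifiers = field.getModifiers();
            if (!Modifier.isStatic(modifiers) || field.isSynthetic()) {
                continue;
            }
            boolean reassignable = !Modifier.isFinal(modifiers);
            boolean container = Map.class.isAssignableFrom(field.getType())
                    || Collection.class.isAssignableFrom(field.getType());
            if (reassignable || container) {
                names.add(field.getName());
            }
        }
        Collections.sort(names); // getDeclaredFields() has no guaranteed order
        return names;
    }

    /** Hypothetical service with the kind of hidden state the answer warns about. */
    static class PriceService {
        static final String CURRENCY = "EUR";                       // constant, fine
        static int requestCount;                                    // per-JVM counter
        static final Map<String, String> CACHE = new HashMap<>();   // per-JVM cache
        String name;                                                // instance field
    }

    public static void main(String[] args) {
        System.out.println(suspiciousStaticFields(PriceService.class));
    }
}
```

Each flagged field still needs a human decision: a per-node counter may be acceptable, while a per-node cache of back-office content usually is not.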