Java clustering for a huge sequential calculation
I have data items 1, 2, 3, ..., n and need to run a sequential calculation over all of them. The value of n is very large, about 600,000 or more. The data is read from a text file that is usually larger than 2 GB.
I have a Java program that performs the calculation in a loop. Processing usually takes more than 24 hours. I need to use a cluster to reduce the processing time by distributing the job across different cluster nodes.
Currently I am doing parallel processing on my local computer with 4 CPU cores. The work is split into pieces and handed to the 4 cores. When a core finishes a piece of work, the next piece is loaded. So there is a queue, and the 4 cores process the queue in parallel.
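The local setup described above can be sketched roughly as follows. This is only an illustration, not the actual program: the class name `LocalWorkQueue`, the piece size, and `processPiece` (a placeholder that just sums the item numbers) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class LocalWorkQueue {

    // Hypothetical stand-in for the real calculation on items [from, to).
    static long processPiece(int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }

    // Splits items 1..n into pieces of pieceSize and queues them on a pool
    // of `cores` worker threads; an idle worker takes the next queued piece.
    static long runAll(int n, int pieceSize, int cores) {
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int from = 1; from <= n; from += pieceSize) {
                final int start = from;
                final int end = Math.min(from + pieceSize, n + 1);
                results.add(pool.submit(() -> processPiece(start, end)));
            }
            // Combine the partial results once every piece has finished.
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("A piece failed to complete", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 600,000 items in pieces of 10,000, processed by 4 worker threads.
        System.out.println(runAll(600_000, 10_000, 4));
    }
}
```

The executor's internal queue plays the role of the work queue here. Moving to a cluster essentially means replacing that in-process queue with one that several machines can read from.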
Which cluster application is good for Java at the enterprise level? Do I need to change my program code, or can the cluster software handle the distribution without modifying the Java code? How can I split the job and distribute it across the cluster? Do I need to upload the data file to all the cluster nodes?
I would be very grateful for your help.
Instead of using a local queue, you could use a JMS queue. ActiveMQ is an easy-to-use JMS server. You would just add tasks to this queue, and any number of listener nodes could consume them.
Have you considered Infinispan? You could load up your data into Infinispan and it gets distributed across a cluster, then run your calculation as a Map/Reduce task across this cluster. See http://infinispan.blogspot.com/2011/01/introducing-distributed-execution-and.html as well.
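Whether this fits depends on the calculation splitting into an independent "map" step per chunk and a "reduce" step that combines the partial results. The plain-JDK sketch below shows only that shape and does not use the Infinispan API. The names `MapReduceShape`, `mapChunk` and `reduce`, and the sum-of-squares calculation, are hypothetical placeholders.

```java
import java.util.Arrays;
import java.util.List;

public class MapReduceShape {

    // Map step: hypothetical per-chunk calculation producing a partial result.
    // It must not depend on any other chunk.
    static long mapChunk(int[] chunk) {
        long partial = 0;
        for (int value : chunk) {
            partial += (long) value * value;
        }
        return partial;
    }

    // Reduce step: combines two partial results. It must be associative so
    // that partial results can be merged in any grouping.
    static long reduce(long a, long b) {
        return a + b;
    }

    // Runs the map step on every chunk (in parallel on local threads here)
    // and folds the partial results together with the reduce step.
    static long run(List<int[]> chunks) {
        return chunks.parallelStream()
                .mapToLong(MapReduceShape::mapChunk)
                .reduce(0L, MapReduceShape::reduce);
    }

    public static void main(String[] args) {
        List<int[]> chunks = Arrays.asList(
                new int[] {1, 2},
                new int[] {3},
                new int[] {4, 5});
        System.out.println(run(chunks));
    }
}
```

If the real calculation needs the result of item i before it can start item i+1, it does not have this shape, and a data grid will not speed it up on its own.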