Better to build or buy a compute grid platform?
I want to do some processor-intensive brute-force string matching. I ran my prototype in a multi-threaded environment and compared its performance with an implementation using Gridgain on a couple of nodes (also multithreaded).
My Gridgain implementation turned out to be slower than my multithreaded implementation. There may have been a flaw in my Gridgain implementation, since it was only a prototype, but I thought the results were indicative. So my question is this:
What are the advantages of learning and then building against a particular grid platform (Hadoop, Gridgain, or EC2 if going hosted; other suggestions welcome), when one could fairly easily put together a lightweight compute grid with a much shallower learning curve? In other words, what do we get for free with these cloud/grid platforms that is worth having or tricky to implement?
(Please note, I don't have any need for a data grid)
Cheers,
-James
(P.S. Happy to make this community wiki if need be)
What kind of grid are you dealing with? A dozen hosts running the same OS is pretty straightforward to run as a grid. All you really have to deal with is sending work to each host, plus maybe a little load balancing, handling a host going down, and distributing new service code to the hosts when you update your service. Even if you skip all of those, it's not a big deal, because the grid is a manageable size. If you're dealing with thousands of hosts, or with a service that should never be down or return errors because a single host went down, then you suddenly have to worry about:
- not overloading any single host
- distributing new service code
- detecting when a host isn't responding and not sending it new work, as well as resending whatever it was working on
- possibly working across different OSes and architectures (little vs. big endian)
- energy savings - shutting down hosts during low load and bringing them back up for high load
- scaling - if you add 100 hosts to your grid tomorrow how long does it take to get them connected and working?
- reliability - some services may actually perform calculations on 2-3 different hosts and only return an answer that all the hosts agree on
That's a short list of things that most grid software should do for you if you need it. If you're working on something small or non-critical, then by all means roll your own. If you're working on something that has to work, or that is big enough that any manual step in the deployment process would be a maintenance nightmare, then you probably want to go with something that already exists.
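To give a feel for what "roll your own" means for two of the items above (resending work when a host stops responding, and only accepting an answer that several hosts agree on), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the hosts are simulated as in-process functions rather than real machines, and the names `make_host`, `run_with_failover`, `run_with_quorum` and `dispatch` are hypothetical, not part of Gridgain, Hadoop or any other platform.

```python
from collections import Counter


class HostDown(Exception):
    """Raised by a simulated host that is not responding."""


def make_host(name, healthy=True):
    # Hypothetical stand-in for a remote worker. A task is a
    # (pattern, chunk) pair; the result is how often pattern occurs in chunk.
    def run(task):
        if not healthy:
            raise HostDown(name)
        pattern, chunk = task
        return chunk.count(pattern)

    run.name = name
    return run


def run_with_failover(task, hosts, down):
    """Try hosts in order, skip known-dead ones, resend the task on failure."""
    for host in hosts:
        if host.name in down:
            continue  # do not send new work to a host marked as down
        try:
            return host(task)
        except HostDown:
            down.add(host.name)  # remember the failure, then try the next host
    raise RuntimeError("no healthy hosts left")


def run_with_quorum(task, hosts, replicas=3):
    """Run the task on several hosts and accept only a unanimous answer."""
    answers = [host(task) for host in hosts[:replicas]]
    value, votes = Counter(answers).most_common(1)[0]
    if votes != len(answers):
        raise RuntimeError("hosts disagree")
    return value


def dispatch(tasks, hosts):
    """Send each task to a host, rotating the starting host per task."""
    down = set()
    results = []
    for i, task in enumerate(tasks):
        start = i % len(hosts)
        order = hosts[start:] + hosts[:start]  # crude round-robin balancing
        results.append(run_with_failover(task, order, down))
    return results, down
```

This version runs tasks one after another in a single process, so it leaves out the hard parts: real network calls, timeouts, concurrent dispatch, code distribution and hosts that come back after failing. Those gaps are roughly what an existing grid platform fills in for you.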