Which Map-Reduce library and/or platform to use with Java?
I have been reading and hearing about cloud computing and map-reduce techniques lately. I am thinking of playing around with some algorithms to get practical experience in that field and see what is possible right now.
Here is what I want to do: I would like to use a public cloud platform (e.g. Google App Engine, Google Map Reduce, Amazon ECS, Amazon Map Reduce) that comes with built-in map-reduce functionality or, if it comes without built-in support, use an additional map-reduce Java library (e.g. Hadoop, Hive), and implement and deploy some algorithms.
Does anyone have experience in this field and can suggest a good starting point? Or can you name some combinations that have worked well in practice?
Thanks in advance!
Amazon EC2 has some pre-bundled Hadoop AMIs. See Running Hadoop on Amazon EC2 for a tutorial.
In particular, the Cloudera distribution comes to mind - it comes with Pig and Hive as well.
Apache Hadoop is a major open-source Java framework for distributed computing, and it includes a MapReduce subproject that is based on the original Google MapReduce.
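If you want to get a feel for the programming model before setting up a cluster, here is a minimal sketch of the map, shuffle, and reduce phases as a word count in plain Java. It runs in a single JVM and deliberately does not use Hadoop's API; the class name `WordCountSketch` and its method names are made up for illustration, and it assumes Java 9 or later (for `List.of` and `Map.entry`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-JVM illustration of the map-reduce pattern.
// This is NOT Hadoop's API; all names here are assumptions for the sketch.
class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by their key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values for one key into a single count.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    // Driver: run map over every line, shuffle, then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(pairs).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(wordCount(List.of("the quick brown fox", "the lazy dog", "the fox")));
    }
}
```

In a real framework the map and reduce calls would run on different machines and the shuffle would move data over the network, but the shape of the two functions you write stays roughly the same.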