MapReduce on HBase using Thrift in .NET?
Can I use Hadoop Streaming to run MapReduce jobs on HBase using Thrift in .NET? Or is there any other way to run MapReduce jobs on HBase from .NET?
You could also use the REST API (Stargate). However, neither the Thrift nor the Stargate server is a good way to run MapReduce jobs. Both require a separate daemon process, which would be a single point of contention and wouldn't provide data locality. The Java MapReduce API identifies data-local regions for input splits, so the key is to use the Java API with .NET. This question describes a third-party enhancement to the streaming API for HBase, which would allow you to use a .NET app via stdin/stdout.
I have successfully achieved this, so the answer is yes, it can be done.
Edit
I don't know why this was downvoted; the question itself contains the answer, but here is how I achieved it:
Thrift is more lightweight than the REST API and in some scenarios performs better than the Java API. I used the Hadoop Streaming API and gave it my own mapper implementation, which uses Thrift to communicate with HBase, e.g.:
bin/hadoop jar contrib/streaming/hadoop-*-streaming.jar -input input/sample.txt -output output -mapper input/StdInOut.exe -reducer NONE
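For reference, a mapper under Hadoop Streaming only has to read records from stdin and write tab-separated key/value lines to stdout. Below is a minimal sketch of that contract, written in Python for brevity rather than .NET. The names `map_lines` and `placeholder_lookup` are hypothetical, the lookup is only a stand-in for wherever a Thrift call to HBase would go, and the sketch assumes each input line holds one row key.

```python
import sys
from typing import Callable, Iterable, Iterator, Optional


def map_lines(
    lines: Iterable[str],
    lookup: Callable[[str], Optional[str]],
) -> Iterator[str]:
    """Turn input lines into 'key<TAB>value' records, as streaming expects."""
    for line in lines:
        row_key = line.strip()
        if not row_key:
            continue  # skip blank input lines
        value = lookup(row_key)
        if value is None:
            continue  # nothing found for this key, emit nothing
        yield f"{row_key}\t{value}"


def placeholder_lookup(row_key: str) -> Optional[str]:
    # Hypothetical stand-in: a real mapper would query HBase through
    # a Thrift client here. This version just returns the key length.
    return str(len(row_key))


def main() -> None:
    # Hadoop Streaming feeds the input split on stdin and collects stdout.
    for record in map_lines(sys.stdin, placeholder_lookup):
        sys.stdout.write(record + "\n")


if __name__ == "__main__":
    main()
```

A .NET console app passed to -mapper would follow the same shape: loop over Console.In, do the lookup, and write one tab-separated record per line to Console.Out. Keeping the lookup as a separate function makes it possible to test the mapper without a running Thrift server.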