Hadoop WordCount example - Implementing Sorting
I'm a Hadoop newbie. I have been able to successfully run the WordCount example.
I would like to modify this example such that my output is sorted in ascending order of count. I'm unable to figure out where I would need to make the necessary changes.
It would be great if someone could give me some direction on how to implement sorting.
See org.apache.hadoop.examples.Sort
This is not entirely straightforward with map/reduce. Each reducer sorts only its own keys, so a globally sorted output across several reducers involves taking a histogram of your data and using the TotalOrderPartitioner.
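To make the histogram idea concrete, here is a minimal plain-Java sketch of range partitioning by count. It does not use the Hadoop API: the class `TotalOrderSketch` and its methods are hypothetical names made up for illustration, and the "sample" here is simply the full list of counts. In a real job you would sample the input to produce the split points and let TotalOrderPartitioner route the keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop.
class TotalOrderSketch {

    // Choose numPartitions - 1 split points at evenly spaced positions
    // of the sorted sample (a crude histogram of the counts).
    static int[] splitPoints(int[] sample, int numPartitions) {
        int[] sorted = sample.clone();
        Arrays.sort(sorted);
        int[] splits = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            splits[i - 1] = sorted[i * sorted.length / numPartitions];
        }
        return splits;
    }

    // A count goes to partition p, where p is the number of split points <= count.
    // Every count in partition p is therefore smaller than every count in partition p + 1.
    static int partitionFor(int count, int[] splits) {
        int p = 0;
        while (p < splits.length && count >= splits[p]) {
            p++;
        }
        return p;
    }

    // Simulate the job: route each (count, word) pair to a partition,
    // sort each partition on its own, then concatenate the partitions in order.
    static List<String> sortByCount(String[] words, int[] counts, int numPartitions) {
        int[] splits = splitPoints(counts, numPartitions);
        List<List<Integer>> partitions = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.add(new ArrayList<>());
        }
        for (int i = 0; i < words.length; i++) {
            partitions.get(partitionFor(counts[i], splits)).add(i);
        }
        List<String> out = new ArrayList<>();
        for (List<Integer> part : partitions) {
            // Stands in for the per-reducer sort; ties keep input order.
            part.sort((a, b) -> Integer.compare(counts[a], counts[b]));
            for (int i : part) {
                out.add(counts[i] + "\t" + words[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"the", "hadoop", "a", "sort", "map"};
        int[] counts = {12, 3, 7, 1, 3};
        for (String line : sortByCount(words, counts, 2)) {
            System.out.println(line);
        }
    }
}
```

The point of the split points is that each partition covers a disjoint range of counts, so the per-partition outputs, read in partition order, are already in ascending order of count. If you only need a small result, a simpler route is a second job that emits the count as the key and runs with a single reducer.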
Alternatively, you can use Hive or Pig, both of which have built-in support for sorting.