Sorting key-value pairs after the map function in MapReduce
I have a file, which contains IP packet headers in text format.
After the map function, the reduce method is called once for each IP address. I want the values for each IP address to arrive in sorted order, but they are not sorted. Each value is basically a line that contains a timestamp, and I want all the values passed to reduce to be sorted by that timestamp.
Where should I do that sorting?
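Hadoop MapReduce supports a technique called "secondary sort", which does what you want: you move the timestamp into a composite key (IP address plus timestamp), so the framework sorts by it during the shuffle, while still partitioning and grouping by IP address only.
The book "Hadoop: The Definitive Guide" has a pretty good chapter on the subject.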
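To make the idea concrete, here is a minimal sketch in plain Java that only simulates what the shuffle does under secondary sort; it is not Hadoop code. The class and method names (`SecondarySortSketch`, `IpTimeKey`, `map`, `shuffle`) and the line layout `<timestamp> <ip> <rest>` are assumptions made up for illustration, since the question does not show the actual file format. In a real job the three roles marked in the comments would be wired up through `Job.setPartitionerClass`, `Job.setSortComparatorClass` and `Job.setGroupingComparatorClass`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SecondarySortSketch {

    /** Composite key: the natural key (IP) plus the field the values should be sorted by. */
    static final class IpTimeKey {
        final String ip;
        final long timestamp;

        IpTimeKey(String ip, long timestamp) {
            this.ip = ip;
            this.timestamp = timestamp;
        }
    }

    /** One map output record: composite key plus the original line as the value. */
    static final class Record {
        final IpTimeKey key;
        final String line;

        Record(IpTimeKey key, String line) {
            this.key = key;
            this.line = line;
        }
    }

    /** Mapper role. ASSUMED line layout: "<timestamp> <ip> <rest of header>". */
    static Record map(String line) {
        String[] parts = line.trim().split("\\s+", 3);
        return new Record(new IpTimeKey(parts[1], Long.parseLong(parts[0])), line);
    }

    /** Partitioner role: uses the IP only, so every record of an IP reaches the same reducer. */
    static int partition(IpTimeKey key, int numReducers) {
        return (key.ip.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    /** Sort comparator role: order by IP first, then by timestamp. */
    static final Comparator<Record> SORT = Comparator
            .comparing((Record r) -> r.key.ip)
            .thenComparingLong(r -> r.key.timestamp);

    /**
     * Grouping comparator role: after sorting, records that share an IP form
     * one reduce call, and their values are already in timestamp order.
     */
    static Map<String, List<String>> shuffle(List<Record> mapOutput) {
        List<Record> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT);
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Record r : sorted) {
            groups.computeIfAbsent(r.key.ip, k -> new ArrayList<>()).add(r.line);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "30 10.0.0.1 third",
                "20 10.0.0.2 only",
                "10 10.0.0.1 first",
                "15 10.0.0.1 second");
        List<Record> mapOutput = new ArrayList<>();
        for (String line : lines) {
            mapOutput.add(map(line));
        }
        // Each entry stands for one reduce call: key = IP, values sorted by timestamp.
        for (Map.Entry<String, List<String>> e : shuffle(mapOutput).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The point of the design is that the reducer never sorts anything itself: the timestamp rides along in the key, the framework's sort puts records in order, and grouping by IP alone keeps all of an address's lines in a single reduce call.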