
Sorting key-value pairs after the map function in MapReduce

I have a file, which contains IP packet headers in text format.

After the map function, the reduce method is called once for each IP address. I want the values in sorted order, but they are not sorted. Each value is a line that contains a timestamp. I want all the values passed to reduce to be sorted by timestamp.

Where should I do that sorting?


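Hadoop MapReduce supports a technique called "secondary sort" which does what you want. The framework sorts map output by key only, not by value, so you move the timestamp into a composite key (IP address plus timestamp), sort on the whole key, and partition and group on the IP address alone. Each reduce call then receives its values already ordered by timestamp.

The book "Hadoop: The Definitive Guide" has a pretty good chapter on the subject.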
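To make the idea concrete, here is a minimal sketch that simulates the pattern with plain JDK collections; it does not use any Hadoop classes. The class name `SecondarySortSketch`, the helpers `map` and `shuffle`, and the line layout `<timestamp> <ip> <rest>` are all assumptions made for illustration, since the question does not show the actual header format. In a real job, the composite key would be a custom `WritableComparable`, and the sort, grouping, and partitioning roles would be set on the `Job` with `setSortComparatorClass`, `setGroupingComparatorClass`, and `setPartitionerClass`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-JDK simulation of the secondary sort pattern; no Hadoop classes are used.
class SecondarySortSketch {

    // Composite key: the natural key (IP) plus the field to order values by (timestamp).
    static final class PacketKey {
        final String ip;
        final long timestamp;
        PacketKey(String ip, long timestamp) { this.ip = ip; this.timestamp = timestamp; }
    }

    // One map output record: the composite key and the original header line as the value.
    static final class MapOutput {
        final PacketKey key;
        final String line;
        MapOutput(PacketKey key, String line) { this.key = key; this.line = line; }
    }

    // Plays the role of the sort comparator: order by IP first, then by timestamp.
    static final Comparator<PacketKey> SORT_ORDER =
            Comparator.comparing((PacketKey k) -> k.ip).thenComparingLong(k -> k.timestamp);

    // Hypothetical parser: assumes each line looks like "<timestamp> <ip> <rest...>".
    static MapOutput map(String line) {
        String[] fields = line.trim().split("\\s+");
        return new MapOutput(new PacketKey(fields[1], Long.parseLong(fields[0])), line);
    }

    // Simulates the shuffle: sort on the full composite key, then group on the IP only
    // (the grouping comparator's job), so each group's values arrive ordered by timestamp.
    static Map<String, List<String>> shuffle(List<String> lines) {
        List<MapOutput> outputs = new ArrayList<>();
        for (String line : lines) {
            outputs.add(map(line));
        }
        outputs.sort((a, b) -> SORT_ORDER.compare(a.key, b.key));

        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (MapOutput out : outputs) {
            groups.computeIfAbsent(out.key.ip, ip -> new ArrayList<>()).add(out.line);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "300 10.0.0.2 SYN",
                "200 10.0.0.1 ACK",
                "100 10.0.0.1 SYN",
                "150 10.0.0.2 FIN");
        // Each entry stands in for one reduce call: an IP and its lines in timestamp order.
        shuffle(lines).forEach((ip, values) -> System.out.println(ip + " -> " + values));
    }
}
```

The point of the design is that no sorting happens inside reduce: the ordering is done during the shuffle, so the reducer never has to buffer all values for one IP address in memory.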
