Using set/list data types for intermediate keys in Hadoop

In an Apache Hadoop map-reduce program, what are the options for using sets/lists as keys in the output from the mapper?

My initial idea was to use ArrayWritable as the key type, but that is not allowed, as the class does not implement WritableComparable. Do I need to define a custom class, or is there some other set-like class in the Hadoop libraries that can act as a key?


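ArrayWritable implements Writable, which is a superinterface of WritableComparable. That makes it usable as a value type, but map output keys must implement WritableComparable so that they can be sorted.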

Did you subclass ArrayWritable? According to the documentation you need to subclass it so that you can set the type of object to be stored by the array. For example:

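import org.apache.hadoop.io.ArrayWritable;
import org.apache.hadoop.io.Text;

// Fixes the element type to Text so the array's contents can be instantiated on deserialization.
public class TextArrayWritable extends ArrayWritable {

    public TextArrayWritable() {
        super(Text.class);
    }
}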

Check out the ArrayWritable javadocs.
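
Subclassing ArrayWritable fixes the element type but does not by itself give the class an ordering, so a custom key class is one possible route. Below is a minimal sketch of what such a key might look like. The class name StringListKey is hypothetical, and it stores plain String values rather than Text. To keep the sketch compilable without Hadoop on the classpath it declares only Comparable; in a real job the class would presumably also declare `implements WritableComparable<StringListKey>`, whose write, readFields and compareTo signatures the methods below are meant to match.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical key type holding an ordered list of strings.
// Assumption: in a Hadoop job this would also implement WritableComparable<StringListKey>.
final class StringListKey implements Comparable<StringListKey> {
    private final List<String> items = new ArrayList<>();

    // Key classes are created reflectively, so a no-argument constructor is needed.
    StringListKey() {}

    StringListKey(List<String> values) {
        items.addAll(values);
    }

    List<String> getItems() {
        return items;
    }

    // Serialize as an element count followed by each element.
    public void write(DataOutput out) throws IOException {
        out.writeInt(items.size());
        for (String s : items) {
            out.writeUTF(s);
        }
    }

    // Clear first: an instance may be reused for several records.
    public void readFields(DataInput in) throws IOException {
        items.clear();
        int n = in.readInt();
        for (int i = 0; i < n; i++) {
            items.add(in.readUTF());
        }
    }

    // Lexicographic order: compare element by element, shorter list first on a tie.
    @Override
    public int compareTo(StringListKey other) {
        int n = Math.min(items.size(), other.items.size());
        for (int i = 0; i < n; i++) {
            int c = items.get(i).compareTo(other.items.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(items.size(), other.items.size());
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof StringListKey && items.equals(((StringListKey) o).items);
    }

    // Equal keys must hash alike so that they reach the same reducer.
    @Override
    public int hashCode() {
        return items.hashCode();
    }
}
```

For set rather than list semantics, one option is to sort and de-duplicate the elements before constructing the key, so that two equal sets always serialize and compare identically.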
