Custom InputFormat to process protobufs in Hadoop 0.20
I'd like to process protobufs using Hadoop, but I'm unsure where to start. I don't care about splitting large files. The protobufs are stored as binary data. What class should I extend to make this easier?
elephant-bird can process protobufs in Hadoop. The framework generates Hadoop I/O classes alongside the regular protobuf classes. Note that it uses LZO compression.
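If you would rather write the input format yourself, the usual route in Hadoop 0.20 is to extend FileInputFormat, override isSplitable to return false (since splitting doesn't matter here), and supply a RecordReader that pulls one message at a time out of the file. The core of that RecordReader is the record framing. Below is a minimal sketch of that part in plain Java, assuming each message was written with a varint length prefix (the framing protobuf's writeDelimitedTo produces). The question doesn't say how the files are framed, so that is an assumption, and DelimitedRecordReader is a hypothetical name, not a Hadoop or elephant-bird class.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * Hypothetical helper (not a Hadoop class): reads binary records that are
 * each preceded by a base-128 varint holding the record length in bytes.
 * ASSUMPTION: the input files use this length-prefixed framing.
 */
final class DelimitedRecordReader {
    private final InputStream in;

    DelimitedRecordReader(InputStream in) {
        this.in = in;
    }

    /** Returns the next record's raw bytes, or null at a clean end of stream. */
    byte[] next() throws IOException {
        int first = in.read();
        if (first == -1) {
            return null; // no more records
        }
        int length = readVarint(first);
        if (length < 0) {
            throw new IOException("negative record length");
        }
        byte[] record = new byte[length];
        int off = 0;
        // InputStream.read may return fewer bytes than requested, so loop.
        while (off < length) {
            int n = in.read(record, off, length - off);
            if (n == -1) {
                throw new EOFException("truncated record");
            }
            off += n;
        }
        return record;
    }

    /** Decodes a 32-bit varint whose first byte has already been read. */
    private int readVarint(int firstByte) throws IOException {
        int result = firstByte & 0x7f;
        int shift = 7;
        int b = firstByte;
        while ((b & 0x80) != 0) { // high bit set: another byte follows
            if (shift >= 32) {
                throw new IOException("malformed varint");
            }
            b = in.read();
            if (b == -1) {
                throw new EOFException("truncated varint");
            }
            result |= (b & 0x7f) << shift;
            shift += 7;
        }
        return result;
    }

    /** Writes one record using the same framing; handy for building test data. */
    static void writeRecord(OutputStream out, byte[] payload) throws IOException {
        int v = payload.length;
        while ((v & ~0x7f) != 0) {
            out.write((v & 0x7f) | 0x80);
            v >>>= 7;
        }
        out.write(v);
        out.write(payload);
    }
}
```

Inside a real RecordReader you would open the split's file, wrap the stream in something like this, and hand each byte array to your generated message class's parseFrom. If the files were written with writeDelimitedTo, calling parseDelimitedFrom on the stream directly does the same job without a helper.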