How to save only non-empty reducers' output in HDFS

In my application the reducers save all of their part files in HDFS, but I want a reducer to write its part file only when the file would not be 0 bytes. Please let me know how to configure this.


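It is possible with LazyOutputFormat, which creates a part file only when the first record is written to it. See the documentation section on "Lazy Output Creation":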

http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#Lazy+Output+Creation

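import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
// Wrap the real output format so that empty part files are never created.
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);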
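
To illustrate the idea behind lazy output, here is a minimal sketch in plain Java that does not use Hadoop at all. The class name LazyFileWriter and the part-r-* file names are hypothetical and chosen only for illustration; this is not Hadoop's implementation, just the same pattern of deferring file creation until the first record arrives.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration class, not part of Hadoop.
public class LazyFileWriter implements AutoCloseable {
    private final Path path;
    private BufferedWriter writer; // stays null until the first record is written

    public LazyFileWriter(Path path) {
        this.path = path;
    }

    public void write(String record) throws IOException {
        if (writer == null) {
            // The file is created here, on first write, not in the constructor.
            writer = Files.newBufferedWriter(path);
        }
        writer.write(record);
        writer.newLine();
    }

    @Override
    public void close() throws IOException {
        if (writer != null) {
            writer.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-output");
        Path empty = dir.resolve("part-r-00000");
        Path full = dir.resolve("part-r-00001");

        // A "reducer" that emits nothing: no file should appear.
        try (LazyFileWriter w = new LazyFileWriter(empty)) {
        }
        // A "reducer" that emits one record: the file is created.
        try (LazyFileWriter w = new LazyFileWriter(full)) {
            w.write("key\tvalue");
        }

        System.out.println(Files.exists(empty)); // false
        System.out.println(Files.exists(full));  // true
    }
}
```

An eager writer would open the file in its constructor and leave a 0-byte file behind when nothing is emitted, which is the behaviour described in the question.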


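If you're using the old API (org.apache.hadoop.mapred), use the LazyOutputFormat class from that package instead. Note that NullOutputFormat is not a substitute here: it discards all reducer output, so it only fits when records are written by other means, such as MultipleOutputs.

import org.apache.hadoop.mapred.TextOutputFormat;
import org.apache.hadoop.mapred.lib.LazyOutputFormat;
// conf is the job's JobConf.
LazyOutputFormat.setOutputFormatClass(conf, TextOutputFormat.class);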