How to save only non-empty reducer output in HDFS
In my application, every reducer saves its part file to HDFS, even when the file is empty. I want a reducer to write its part file only if the file is larger than 0 bytes. Please let me know how to configure this.
It is possible - see the documentation section on "Lazy Output":
http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html#Lazy+Output+Creation
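import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
// Use this instead of job.setOutputFormatClass(TextOutputFormat.class)
LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);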
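In case it helps to see why this avoids the empty files: the lazy wrapper holds off on creating the real record writer until the first record is written, so a reducer that emits nothing never opens a file. Below is a rough, Hadoop-free sketch of that idea in plain Java. LazyFileWriter and LazyWriterDemo are made-up names for illustration only, and this is not Hadoop's actual implementation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper (not a Hadoop class): opens its file on the first write only.
class LazyFileWriter implements AutoCloseable {
    private final Path path;
    private BufferedWriter out; // stays null until a record is written

    LazyFileWriter(Path path) {
        this.path = path;
    }

    void write(String record) throws IOException {
        if (out == null) {
            // First record: only now is the file actually created.
            out = Files.newBufferedWriter(path, StandardCharsets.UTF_8);
        }
        out.write(record);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        // Nothing was written, so nothing was opened and no file exists.
        if (out != null) {
            out.close();
        }
    }
}

class LazyWriterDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazy-output");
        Path emptyPart = dir.resolve("part-00000");
        Path fullPart = dir.resolve("part-00001");

        // A "reducer" that emits no records.
        try (LazyFileWriter w = new LazyFileWriter(emptyPart)) {
            // intentionally empty
        }
        // A "reducer" that emits one record.
        try (LazyFileWriter w = new LazyFileWriter(fullPart)) {
            w.write("key\tvalue");
        }

        System.out.println(Files.exists(emptyPart)); // false
        System.out.println(Files.exists(fullPart));  // true
    }
}
```

An eager writer would open the file in its constructor, which is what leaves the 0-byte part files behind.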
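If you're using the old API, you can use the NullOutputFormat class. Be aware that it discards everything sent to the default output, so no part files are created at all and any output you want to keep has to be written some other way: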
import org.apache.hadoop.mapred.lib.NullOutputFormat;
conf.setOutputFormat(NullOutputFormat.class);