MapReduce word count example doesn't work
I am trying to implement the word count example myself. Here is my implementation of the mapper:
public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        Text word = new Text();
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, new IntWritable(1));
        }
    }
}
and the reducer:
public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        while (values.hasNext())
            sum += values.next().get();
        context.write(key, new IntWritable(sum));
    }
}
But the output I get from running this code looks like the output of the mapper only. For example, if the input is "hello world hello", the output is
hello 1
hello 1
world 1
I also use a combiner between mapping and reducing. Can anyone explain what's wrong with this code?
Thanks a lot!
Replace your reduce method with this one:
@Override
protected void reduce(Text key, java.lang.Iterable<IntWritable> values, org.apache.hadoop.mapreduce.Reducer<Text, IntWritable, Text, IntWritable>.Context context)
        throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable value : values) {
        sum += value.get();
    }
    context.write(key, new IntWritable(sum));
}
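So the bottom line is that you're not overriding the correct method: your version takes an Iterator, while Reducer.reduce takes an Iterable, so yours is only an overload that the framework never calls. The @Override annotation helps catch this kind of error at compile time.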
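To see why the job output looks like raw mapper output, here is a minimal sketch in plain Java with no Hadoop dependency. BaseReducer, BrokenReduce, FixedReduce and runWith are hypothetical names made up for this illustration; BaseReducer only imitates the fact that Hadoop's default Reducer.reduce passes every (key, value) pair through unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OverrideDemo {

    // Hypothetical stand-in for Hadoop's Reducer: the default reduce() is an
    // identity function that emits every (key, value) pair unchanged.
    static class BaseReducer {
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            for (Integer value : values) {
                out.add(key + " " + value);
            }
        }

        // The "framework" only ever calls the Iterable version.
        final void run(String key, List<Integer> values, List<String> out) {
            reduce(key, values, out);
        }
    }

    // Mirrors the question: Iterator instead of Iterable makes this an
    // overload, so run() never reaches it. Adding @Override here would
    // turn the mistake into a compile error.
    static class BrokenReduce extends BaseReducer {
        public void reduce(String key, Iterator<Integer> values, List<String> out) {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            out.add(key + " " + sum);
        }
    }

    // Same parameter types as the base method, so this really overrides it.
    static class FixedReduce extends BaseReducer {
        @Override
        protected void reduce(String key, Iterable<Integer> values, List<String> out) {
            int sum = 0;
            for (Integer value : values) {
                sum += value;
            }
            out.add(key + " " + sum);
        }
    }

    // Feeds the key "hello" with the values [1, 1] through the given reducer.
    static List<String> runWith(BaseReducer reducer) {
        List<String> out = new ArrayList<>();
        reducer.run("hello", Arrays.asList(1, 1), out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runWith(new BrokenReduce())); // [hello 1, hello 1]
        System.out.println(runWith(new FixedReduce()));  // [hello 2]
    }
}
```

The broken variant reproduces the symptom from the question: the summing code compiles fine but is never executed, so the identity behaviour of the base class shows through.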
Also make sure you set Reduce.class as the reducer class and not Reducer.class!
;) HTH Johannes
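If you would rather not write out the fully qualified argument types when overriding the reduce method, an alternative that keeps the explicit iterator loop is:
@Override
protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
    int sum = 0;
    // The parameter types must still match Reducer.reduce (Text, Iterable<IntWritable>, Context);
    // raw Object/Iterable parameters would not override it in a Reducer<Text, IntWritable, Text, IntWritable> subclass.
    Iterator<IntWritable> itr = values.iterator();
    while (itr.hasNext()) {
        sum += itr.next().get();
    }
    context.write(key, new IntWritable(sum));
}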