
hadoop not running in the multinode cluster

I have a jar file "Tsp.jar" that I made myself. The same jar executes fine on a single-node Hadoop setup. However, when I run it on a cluster of two machines, a laptop and a desktop, it gives me an exception when the map phase reaches 50%. Here is the output:

`hadoop@psycho-O:/usr/local/hadoop$ bin/hadoop jar Tsp.jar clust-Tsp_ip1 clust_Tsp_op4
11/04/27 16:13:06 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/04/27 16:13:06 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/04/27 16:13:06 INFO mapred.FileInputFormat: Total input paths to process : 1
11/04/27 16:13:06 INFO mapred.JobClient: Running job: job_201104271608_0001
11/04/27 16:13:07 INFO mapred.JobClient:  map 0% reduce 0%
11/04/27 16:13:17 INFO mapred.JobClient:  map 50% reduce 0%
11/04/27 16:13:20 INFO mapred.JobClient: Task Id : attempt_201104271608_0001_m_000001_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Tsp$TspReducer
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:841)
    at org.apache.hadoop.mapred.JobConf.getCombinerClass(JobConf.java:853)
    at org.apache.hadoop.mapred.Task$CombinerRunner.create(Task.java:1100)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:812)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Tsp$TspReducer
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:809)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:833)
    ... 6 more
Caused by: java.lang.ClassNotFoundException: Tsp$TspReducer
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    ... 7 more

11/04/27 16:13:20 WARN mapred.JobClient: Error reading task outputemil-desktop
11/04/27 16:13:20 WARN mapred.JobClient: Error reading task outputemil-desktop
^Z
[1]+  Stopped                 bin/hadoop jar Tsp.jar clust-Tsp_ip1 clust_Tsp_op4

hadoop@psycho-O:~$ jps
4937 Jps
3976 RunJar

`
Also, the cluster ran the wordcount example fine, so I guess the problem is with the Tsp.jar file.

1) Is it necessary to have a jar file to run on a cluster?

2) Here I tried to run a jar file that I made myself on the cluster, but it still gives a warning that the job jar file is not set. Why is that?

3) What should be taken care of when running a jar file? What must it contain other than the program I wrote? My jar file contains Tsp.class, Tsp$TspReducer.class, and Tsp$TspMapper.class. The terminal says it can't find Tsp$TspReducer even though it is already there in the jar file.

Thank you.

EDIT

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class Tsp {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(Tsp.class);
        conf.setJobName("Tsp");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        conf.setMapperClass(TspMapper.class);
        conf.setCombinerClass(TspReducer.class);
        conf.setReducerClass(TspReducer.class);
        FileInputFormat.addInputPath(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }

    public static class TspMapper extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, Text> {
        // findCost() helper omitted

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            // find adjacency matrix from the input
            for (int i = 0; ...) {
                // .....
                output.collect(new Text(string1), new Text(string2));
            }
        }
    }

    public static class TspReducer extends MapReduceBase
            implements Reducer<Text, Text, Text, Text> {
        Text t1 = new Text();

        public void reduce(Text key, Iterator<Text> values,
                           OutputCollector<Text, Text> output, Reporter reporter)
                throws IOException {
            String a;
            a = values.next().toString();
            output.collect(key, new Text(a));
        }
    }
}


You currently have

conf.setJobName("Tsp");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(Text.class);
conf.setMapperClass(TspMapper.class);
conf.setCombinerClass(TspReducer.class);
conf.setReducerClass(TspReducer.class); 

and, as the warning `No job jar file set` states, you are not setting a jar.

You will need to do something similar to

conf.setJarByClass(Tsp.class);

From what I'm seeing, that should resolve the error seen here.


11/04/27 16:13:06 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

Do what the warning says: when setting up your job, set the jar that contains your classes. Hadoop copies the jar into the DistributedCache (a staging area on every node) and loads your classes from it.
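As a side note, the `Tsp$TspReducer` in the stack trace is simply the JVM's binary name for a nested class: the task JVM is asking its classloader for the `TspReducer` class nested inside `Tsp`, and it fails when the enclosing jar is not on the task's classpath. A minimal, Hadoop-free sketch of that naming (class names here are illustrative, not from the question):

```java
// Demonstrates the Outer$Inner binary naming that the stack trace refers to.
public class NamingDemo {
    // A static nested class, like TspReducer inside Tsp.
    public static class Inner {}

    public static void main(String[] args) throws ClassNotFoundException {
        // getName() returns the binary name, with '$' between outer and inner.
        System.out.println(Inner.class.getName()); // prints NamingDemo$Inner

        // This binary name is what Class.forName must resolve; when the jar
        // holding it is missing from the classpath, forName throws
        // ClassNotFoundException, exactly as in the task logs above.
        Class.forName("NamingDemo$Inner"); // succeeds here
    }
}
```

Accordingly, listing the jar's contents (e.g. with `jar tf Tsp.jar`) should show `Tsp$TspReducer.class` as its own entry; being inside the jar on the client machine is not enough if the jar is never shipped to the task nodes.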


I had the exact same issue. Here is how I solved the problem (imagine your MapReduce class is called A). After creating the job, call:
job.setJarByClass(A.class);
