hadoop - Where are input/output files stored in Hadoop, and how do you execute a Java file in Hadoop?

Suppose I write a Java program and I want to run it in Hadoop, then

  1. where should the file be saved?
  2. how to access it from Hadoop?
  3. should I be calling it by the following command? hadoop classname
  4. what is the command in Hadoop to execute the Java file?


The simplest answers I can think of to your questions are:

1) Anywhere
2, 3, 4) $HADOOP_HOME/bin/hadoop jar [path_to_your_jar_file]

A similar question was asked here: Executing helloworld.java in apache hadoop


It may seem complicated, but it's simpler than you might think!

  1. Compile your map/reduce classes and your main class into a jar. Let's call this jar myjob.jar.
    • This jar does not need to include the Hadoop libraries, but it should include any other dependencies you have.
    • Your main method should set up and run your map/reduce job; a minimal driver sketch is shown after this list.
  2. Put this jar on any machine with the hadoop command line utility installed.
  3. Run your main method using the hadoop command line utility:
    • hadoop jar myjob.jar
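
A minimal driver sketch of what that main method might look like, assuming the Hadoop 2.x mapreduce API and hypothetical WordCountMapper/WordCountReducer classes packaged in the same jar (your mapper, reducer, and key/value types will differ):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "my job");               // Hadoop 2.x style
        job.setJarByClass(MyJob.class);                          // tells Hadoop which jar to ship to the cluster
        job.setMapperClass(WordCountMapper.class);               // hypothetical mapper in the same jar
        job.setReducerClass(WordCountReducer.class);             // hypothetical reducer in the same jar
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));    // input path in HDFS
        FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output path in HDFS (must not already exist)
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Package MyJob together with the mapper and reducer classes into myjob.jar, then run it as, for example, hadoop jar myjob.jar MyJob /input /output (the class name can be omitted if the jar's manifest sets a Main-Class).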

Hope that helps.


  1. where should the file be saved?

The data should be saved in HDFS. You will probably want to load it into the cluster from your data source using something like Apache Flume. The file can be placed anywhere, but the usual home directory is /user/hadoop/.
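If you just want to push a file into HDFS yourself, a small sketch using the HDFS Java API (the local and HDFS paths below are only placeholders):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutFile {
    public static void main(String[] args) throws Exception {
        // Picks up the cluster's default filesystem from core-site.xml on the classpath,
        // so this needs to run on a machine configured to talk to the cluster.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Copy a local file into the HDFS home directory (placeholder paths).
        fs.copyFromLocalFile(new Path("/tmp/mydata.txt"),
                             new Path("/user/hadoop/mydata.txt"));
        fs.close();
    }
}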

  2. how to access it from Hadoop?

SSH into the Hadoop cluster head node as you would any standard Linux server.

To list the root of HDFS: hadoop fs -ls /
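The same listing can also be done programmatically; a minimal sketch using the FileSystem Java API (again assuming core-site.xml is on the classpath):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListRoot {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Rough equivalent of "hadoop fs -ls /": print every entry under the HDFS root.
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
        fs.close();
    }
}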

  3. should I be calling it by the following command? hadoop classname

You should be using the hadoop command to access your data and run your programs; try hadoop help.

  4. what is the command in Hadoop to execute the Java file?

hadoop jar MyJar.jar com.mycompany.MainDriver arg[0] arg[1] ...
