
Executing HelloWorld.java in Apache Hadoop

Can someone please tell me how I can execute my HelloWorld.java in Apache Hadoop? It contains:

class Helloworld
{
  public static void main(String[] args)
  {
    System.out.println("HelloWorld");
  }
}

I am running a single node. Kindly tell me how I can run this piece of code, or please send a link that is understandable for an absolute beginner.


The way a jar is run in Hadoop is with the command:

$HADOOP_HOME/bin/hadoop jar [your_jar_file]
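
For example, assuming your class is packaged into HelloWorld.jar (an illustrative name), you can name the main class explicitly:

$HADOOP_HOME/bin/hadoop jar HelloWorld.jar HelloWorld

If the jar's manifest already specifies a Main-Class, the class name can be omitted.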

You can also use -jar to force it to run as a local job, which is useful for experimenting and debugging.

While I haven't tested with such a simple application, I think it should print the line and then be done. Don't hold me to that though. :-P

You might need to declare main as throws Exception, but I'm not 100% sure about that. My code has it.

I hope that helps. As mentioned in other answers, without getting into setting up Jobs and MapReduce, you're not going to see any gain from Hadoop.


As far as I understand, Apache Hadoop is irrelevant in your case. Your question is really "how do I run a Hello World written in Java?"

If my assumption is correct, do the following.

  1. Install the JDK.
  2. Compile your Java code with the command javac HelloWorld.java. You have to run this from the directory where your code is, and JAVA_HOME/bin should be in your path.
  3. If step 2 succeeded, you should see Helloworld.class in your working directory (the class inside the file is declared as Helloworld, so that is the name of the .class file). Now run it by typing java Helloworld. A sample session is shown after this list.
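
A minimal session might look like this (assuming HelloWorld.java from the question sits in the current directory):

$ javac HelloWorld.java
$ java Helloworld
HelloWorld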

Search for any Java tutorial for beginners for details. Good luck.


Short answer: You don't.

Hadoop doesn't run Java applications in the general sense. It runs MapReduce jobs, which can be written in Java, but don't have to be. You should probably start by reading some of the Apache Hadoop documentation. Here's the MapReduce tutorial. You might also want to look at Tom White's book "Hadoop: The Definitive Guide".

Hadoop is a batch-oriented, large-scale data processing system. It's really only suited to applications in that problem space. If those aren't the kinds of problems you're trying to solve, Hadoop isn't what you're looking for.


Since this is an old question and many people have already answered it, my answer is for beginners like me who accidentally land on this page while looking for a way to run Hello World in Hadoop.

Yes, Hadoop runs on the JVM, but that alone doesn't mean you need Hadoop to run this kind of simple application. Hadoop is for distributed processing. That means: suppose you have a massive dataset and your innocent computer is not capable of processing it. Then what you do is get help from some number of innocent (commodity) computers, which are capable of doing the task together.

In the Hadoop environment, we use a framework called MapReduce to do this kind of task. So obviously, if you are not using the MapReduce framework in the Hadoop environment, it's like using a spaceship to climb up to your rooftop instead of a ladder.

Even though this is the common hello world code for almost every programming language, it is not the hello world program for Hadoop. Here you have a program called Word-Count, which counts the number of occurrences of each word in a large text file, or in any number of files.

Word-Count program (Hadoop HelloWorld)
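
For reference, here is a minimal sketch of that Word-Count job, closely following the WordCount example from the Apache Hadoop MapReduce tutorial (treat it as an illustration, not production code):

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // combiner is a local optimization
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}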

Also, there are 3 modes in which you can run this program:

  1. Local (Standalone) mode
  2. Pseudo Distributed mode
  3. Fully Distributed mode

My advice is to try to run the Word-Count program in pseudo-distributed mode as a beginner.
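
Assuming HDFS is running and the WordCount class above is packaged into a jar (wc.jar is just an illustrative name), a pseudo-distributed run might look roughly like this:

$ bin/hdfs dfs -mkdir -p input
$ bin/hdfs dfs -put some-local-textfile.txt input
$ bin/hadoop jar wc.jar WordCount input output
$ bin/hdfs dfs -cat 'output/part-r-*'

Note that the output directory must not exist before the job runs; Hadoop refuses to overwrite it.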


You need to look at how MapReduce works. You may want to look at the source of the Hadoop examples to get a feel for how MapReduce programs are written.


Standalone Operation

By default, Hadoop is configured to run in a non-distributed mode, as a single Java process. This is useful for debugging.

The following example copies the unpacked conf directory to use as input and then finds and displays every match of the given regular expression. Output is written to the given output directory.

$ mkdir input 
$ cp conf/*.xml input 
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+' 
$ cat output/*

see here: http://hadoop.apache.org/docs/r0.18.2/quickstart.html

