How to make your mapper write to the local file system in Hadoop

I wish to write a file and create a directory in my local file system through my MapReduce code. Also, if I create a directory in the working directory during job execution, how can I move it to my local file system before the cleanup?


As your mapper runs on some (any) machine in your cluster, you can of course use basic Java file operations to write files there. You can use org.apache.hadoop.hdfs.DFSClient to access files on HDFS and copy them to a local file (I'd suggest you copy within HDFS and fetch any files from it after the job has finished).

Of course, your local files will be local to the client machine (I assume these are separate machines), so something like NFS would be needed to make the written files available to you on any client. Watch out for concurrency problems.
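
To make the suggestion above concrete, here is a minimal sketch of the "write locally, copy into HDFS, fetch after the job" pattern using the org.apache.hadoop.fs.FileSystem API; the class name and all paths are made-up examples:

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class LocalWritingMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // The map() calls may have written a scratch file with plain java.io on
        // whatever node this task ran on; push it into HDFS so it survives task
        // cleanup and can be fetched from the client afterwards.
        FileSystem fs = FileSystem.get(context.getConfiguration());
        Path localScratch = new Path("/tmp/mapper-scratch.txt"); // example local path on the task node
        Path hdfsTarget = new Path("/user/me/job-scratch/part"); // example HDFS destination
        fs.copyFromLocalFile(false, true, localScratch, hdfsTarget);
    }
}

After the job finishes on the client, the files can be pulled down with FileSystem.copyToLocalFile(...) or hadoop fs -get. In a real job each task would need its own target name (e.g. derived from the task attempt ID) to avoid collisions.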


I'm interested in writing files locally on the datanode as well. For that, I used java.io.FileWriter and java.io.BufferedWriter:

FileWriter fstream = new FileWriter("log.out", true); // append mode; "log.out" is resolved against the current working directory
BufferedWriter bout = new BufferedWriter(fstream);
bout.append(build.toString()); // "build" holds the text assembled earlier
bout.close();

It only creates the file when executed through Eclipse. When run as a .jar with the following command:

hadoop jar jarFile.jar Mainclass  

it doesn't create anything. I don't know whether it is a problem of mis-execution, misconfiguration, or just that something is missing.

Actually, this is only to create a log file for debugging. The actual files I want the datanode to write locally are created through Runtime.getRuntime(). However, the same thing happens: if the execution is carried out through Eclipse it works, but outside Eclipse it seems to run fine and yet no file is ever created.

Before doing it on a cluster it should work on a single node, so the whole thing is being done on a single computer for now.
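
For what it's worth, one way to narrow this down is to write the debug log to an absolute path and print where the file actually lands, since a relative name like log.out is resolved against whatever working directory the code happens to run in. A minimal sketch (the path and class name are just examples):

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class DebugLog {
    public static void append(String line) throws IOException {
        File out = new File("/tmp/mapper-debug.log"); // example absolute path
        // System.err from a task ends up in the task's stderr log (or the console in local mode),
        // so this shows where the file was actually written.
        System.err.println("writing debug log to " + out.getAbsolutePath());
        BufferedWriter bout = new BufferedWriter(new FileWriter(out, true));
        try {
            bout.append(line).append('\n');
        } finally {
            bout.close();
        }
    }
}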

