How to create and read directories in Hadoop - MapReduce job working directory
I want to create a directory inside the working directory of a MapReduce job in Hadoop.
For example, by using the following in my mapper class to write some intermediate files into it:
File setupFolder = new File(setupFolderName);
setupFolder.mkdirs();
Is this the right way to do it?
Also, after the job completes, how can I access this directory again if I wish to?
Please advise.
If you are using Java, you can override the setup method and open the file handle there (and close it in cleanup). The handle will then be available throughout that mapper task's map() calls.
I am assuming that you are not writing all of the map output here, just some debug/stats. With this handle you can read and write as shown in this example: http://wiki.apache.org/hadoop/HadoopDfsReadWriteExample
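As a minimal sketch of that setup/cleanup pattern (the StatsMapper name, the /tmp/debug side-file path, and the per-task-attempt file naming are assumptions for illustration, not from the question):

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class StatsMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    private FSDataOutputStream sideOut;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        FileSystem fs = FileSystem.get(context.getConfiguration());
        // One side file per task attempt so parallel mappers don't collide.
        sideOut = fs.create(new Path("/tmp/debug/" + context.getTaskAttemptID() + ".log"));
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Normal map output still goes through the context...
        context.write(value, new LongWritable(1));
        // ...while debug/stats records go to the side file.
        sideOut.writeBytes("record at offset " + key.get() + "\n");
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        sideOut.close();
    }
}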
If you want to read the whole directory, check out this example: https://sites.google.com/site/hadoopandhive/home/how-to-read-all-files-in-a-directory-in-hdfs-using-hadoop-filesystem-api
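In condensed form, reading every file in a directory with the HDFS FileSystem API looks roughly like this (the ReadDirectory class name and the command-line argument are placeholders):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadDirectory {
    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        // List every entry under the directory given on the command line.
        for (FileStatus status : fs.listStatus(new Path(args[0]))) {
            if (status.isDirectory()) {
                continue; // skip subdirectories in this simple sketch
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(status.getPath())))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }
}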
Remember that you will not be able to depend on the order of the data written to the files.
You can override setup() in your Reducer class, use mkdirs() to create the folder, and use create() to open an output stream for a file:
@Override
protected void setup(Context context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    FileSystem fs = FileSystem.get(conf);
    // Create the directory on HDFS (a no-op if it already exists).
    fs.mkdirs(new Path("your_path_here"));
}
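Putting the mkdirs() and create() steps together, a fuller sketch might look like the following; the StatsReducer name, the generic types, and the per-task-attempt file name are assumptions for illustration, and the stream opened with create() is closed in cleanup():

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class StatsReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    private FSDataOutputStream out;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        FileSystem fs = FileSystem.get(context.getConfiguration());
        Path dir = new Path("your_path_here"); // placeholder, as above
        fs.mkdirs(dir);
        // One file per task attempt so parallel reducers don't collide.
        out = fs.create(new Path(dir, "stats-" + context.getTaskAttemptID()));
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        out.close();
    }
}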