Adding multiple files to Hadoop distributed cache?
I am trying to add multiple files to the Hadoop distributed cache. Actually, I don't know the file names. They will be named like part-0000*. Can someone tell me how to do that?
Thanks, Bala
You can use either the hadoop fs -put or the hadoop fs -copyFromLocal command:
hadoop fs -copyFromLocal /home/hadoop/outgoing/* /your/hadoop/dir
I solved this problem, although it may be a bit late:
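// Use one Configuration object for both the listing and the cache registration.
Configuration conf = getConf();
FileSystem fs = directoryPath.getFileSystem(conf);
FileStatus[] fileStatus = fs.listStatus(directoryPath);
for (FileStatus status : fileStatus) {
    // Adds the file to the cache and to the task classpath.
    DistributedCache.addFileToClassPath(status.getPath(), conf);
}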
Is this what you wanted to do?
Nothing prevents you from programmatically getting the list of files, if they are all in one directory, and then adding them one by one, right? Or is your case different?
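Since the names follow the part-0000* pattern, one way to do this is to let FileSystem.globStatus expand the pattern and add each match with DistributedCache.addCacheFile. This is only a minimal sketch: it assumes Hadoop 2.x or later on the classpath, and the class name CachePartFiles, the helper addMatchingFiles and the command-line directory argument are made up for illustration. On newer releases DistributedCache is deprecated in favour of Job.addCacheFile, so adjust to whatever your version uses.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class, not part of Hadoop.
public class CachePartFiles {

    /**
     * Adds every regular file matching the glob pattern to the distributed
     * cache and returns the number of files added.
     */
    public static int addMatchingFiles(Path globPattern, Configuration conf)
            throws IOException {
        FileSystem fs = globPattern.getFileSystem(conf);
        // globStatus expands wildcards such as part-0000*; it can return
        // null when the path does not exist.
        FileStatus[] matches = fs.globStatus(globPattern);
        if (matches == null) {
            return 0;
        }
        int added = 0;
        for (FileStatus status : matches) {
            if (status.isDirectory()) {
                continue; // skip directories that happen to match
            }
            DistributedCache.addCacheFile(status.getPath().toUri(), conf);
            added++;
        }
        return added;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the directory holding the part files.
        Configuration conf = new Configuration();
        int added = addMatchingFiles(new Path(args[0], "part-0000*"), conf);
        System.out.println("Added " + added + " file(s) to the distributed cache");
    }
}
```

Call addMatchingFiles on the job's Configuration before submitting the job, otherwise the tasks will not see the cached files.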