
How to design my mapper?

I have to write a MapReduce job but I don't know how to go about it.

I have a jar, MARD.jar, through which I can instantiate MARD objects. On such an object I call the normalise-file method, i.e. mard.normaliseFile(bunch of arguments).

This in turn creates certain output files.

For the normalise method to run, it needs a folder called myMard in the working directory. So I thought I would give the myMard folder as the input path to the Hadoop job, but I'm not sure that would help, because mard.normaliseFile(bunch of arguments) will search for the myMard folder in the working directory and will not find it, since (this is what I think) the Mapper can only access file contents through the "values" obtained from the FileSplit; it cannot give direct access to the files in the myMard folder.
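One common way to make side data such as the myMard folder visible to every mapper is Hadoop's distributed cache (Job.addCacheFile in the mapreduce API): files added there are localized onto each task node and linked into the task's working directory, so a relative File path resolves to them. The sketch below shows only the working-directory half of that idea, without any Hadoop dependency; the class and method names (SetupFolderSketch, ensureSetupFolder) are hypothetical and just for illustration.

```java
import java.io.File;

// Minimal sketch, assuming the MARD library resolves "myMard" relative
// to the current working directory. In a real Hadoop job this would run
// once per task in Mapper.setup(), after the myMard contents have been
// shipped to each node with job.addCacheFile().
public class SetupFolderSketch {

    public static File ensureSetupFolder(String name) {
        // A relative path resolves against the working directory of the
        // running task, which is where mard.normaliseFile() will look.
        File setupFolder = new File(name);
        setupFolder.mkdirs();
        return setupFolder;
    }

    public static void main(String[] args) {
        File folder = ensureSetupFolder("myMard");
        System.out.println(folder.getAbsolutePath() + " exists: " + folder.exists());
    }
}
```

The point of the sketch is that the library never needs an HDFS input path for myMard; it only needs the folder to exist in the task's local working directory before normaliseFile is called, which is exactly what the distributed cache plus a relative path gives you.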

In short, I have to execute the following code through MapReduce:

import java.io.File;
import java.io.IOException;

File setupFolder = new File(setupFolderName);
setupFolder.mkdirs();

MARD mard = new MARD(setupFolder);
Text valuz = new Text();
IntWritable intval = new IntWritable();

File original = new File("Vca1652.txt");
File mardedxml = new File("Vca1652-mardedxml.txt");
File marded = new File("Vca1652-marded.txt");

mardedxml.createNewFile();
marded.createNewFile();

NormalisationStats stats;
try {
    // This method requires access to the myMard folder
    stats = mard.normaliseFile(original, mardedxml, marded, 50.0);
    System.out.println(stats);
} catch (MARDException e) {
    e.printStackTrace();
}

Please help
