Input to the Mapper in Hadoop
We can provide input files to the mapper like this:
FileInputFormat.setInputPaths(conf, inputPath);
Is it possible to pass a reference to memory, say a DOM tree built by a DOM parser from an XML file, as input to the mapper function of the Hadoop framework?
What other possibilities are there?
No, you can't specify memory (RAM) based information.
The reason is that Hadoop applications are generally distributed across many physically separate machines, so an object held in one JVM's memory is not visible to mappers running elsewhere. The current version of Hadoop "only" supports distributed data through HDFS, which is a file system.
What you can do is run the DOM parser as a preprocessing step inside your mapper and simply specify your input file as the input. The easiest way to do that is to create your own subclass of FileInputFormat.
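As a rough sketch of that preprocessing step, the parsing itself could look something like the code below. It uses only the JDK so it runs on its own; the class name `XmlRecordParser` and the sample XML are made up for illustration, and the Hadoop wiring is left out. In a real job you would call something like `parse` from your `map` method, and your FileInputFormat subclass would probably need to override `isSplitable` to return false so that each XML file reaches a single mapper whole.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper (not part of Hadoop): the DOM-building step
// a mapper could run on the content of each input record.
class XmlRecordParser {

    // Parses the raw bytes of one complete XML document into a DOM tree.
    static Document parse(byte[] xmlBytes) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xmlBytes));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the value a mapper would receive for one record.
        byte[] record = "<books><book id=\"1\"/><book id=\"2\"/></books>"
                .getBytes(StandardCharsets.UTF_8);

        Document dom = parse(record);
        System.out.println(dom.getDocumentElement().getTagName());        // books
        System.out.println(dom.getElementsByTagName("book").getLength()); // 2
    }
}
```

The point of doing it this way is that the DOM tree is built on whichever node runs the mapper, from data that node read out of the file system, so nothing in memory has to be shared between machines.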
HTH