STREAM keyword in a Pig script that runs on Amazon Elastic MapReduce
I have a Pig script that invokes another Python program via STREAM. It works in my own Hadoop environment, but it always fails when I run the script on the Amazon Elastic MapReduce web service.
The log says:
org.apache.pig.backend.executionengine.ExecException: ERROR 2090: Received Error while processing the reduce plan: '' failed with exit status: 127
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:347)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.processOnePackageOutput(PigMapReduce.java:288)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:260)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.reduce(PigMapReduce.java:142)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:321)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2216)
Any ideas?
Have you made sure that the script is sent along to the Elastic MapReduce job?
Problem solved! All I needed was to use the CACHE('s3://') option when defining the streaming command.
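For context, exit status 127 typically means the shell could not find the command to execute, i.e. the Python script was never shipped to the task nodes. Below is a minimal sketch of the fix; the bucket, script, and alias names are hypothetical placeholders:

    -- Ship the Python script to each task node via the distributed cache.
    -- The '#myscript.py' suffix is the local name the file gets in the
    -- task's working directory, so the command can invoke it directly.
    DEFINE my_cmd `myscript.py` CACHE('s3://my-bucket/scripts/myscript.py#myscript.py');

    raw       = LOAD 's3://my-bucket/input' AS (line:chararray);
    processed = STREAM raw THROUGH my_cmd;
    STORE processed INTO 's3://my-bucket/output';

The CACHE clause pulls the file from S3 onto every node before the streaming command runs, which is why the mappers and reducers can actually find it (unlike SHIP, which copies a file from the machine that submitted the job).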