
How can I programmatically get all the JobTracker and TaskTracker information that Hadoop displays in its web interface?

I'm using Cloudera's Hadoop distribution CDH-0.20.2CDH3u0. Is there any way I could get information such as JobTracker status, TaskTracker status, and counters using a Java program running outside of the Hadoop framework? I tried listening via JMX, but Hadoop exposes very little information about the JobTracker, TaskTracker, and DataNode. It doesn't provide any JMX attributes related to running job state, such as map percent completion, reduce percent completion, task percent completion, attempt percent completion, or counter status.

Furthermore, I tried using the metrics logs dumped by Hadoop, but they don't contain any information about map/reduce percent completion or task percent completion either.

I think there should be some alternative way to get all of this information.

Please do reply.


You can use the Hadoop API to access this information programmatically. In particular, instantiate JobClient with a suitable configuration for your cluster, and then call getJob on that instance to get a RunningJob. With that, you should be able to get to the detail you're looking for (the following code is completely untested, but I hope it points in the right direction):

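import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.RunningJob;
JobClient theJobClient = new JobClient(new InetSocketAddress("your.job.tracker", 8021), new Configuration());
RunningJob theJob = theJobClient.getJob("job_id_string"); // caution, deprecated
float mapProgress = theJob.mapProgress(); // fraction in [0, 1]; similar for reduceProgress
// etc (see RunningJob)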

You can also get the list of currently running jobs with theJobClient.jobsToComplete(), which returns an array of JobStatus. Each JobStatus should expose similar values (mapProgress, etc.) and can provide the JobID instance you could pass to getJob to obtain the RunningJob above (if you want to avoid the deprecated method).
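To make the jobsToComplete route a bit more concrete, here is a rough, equally untested sketch written against the 0.20 org.apache.hadoop.mapred API. The host name and port are placeholders carried over from the snippet above, and the class name JobProgressDump and the percent helper are made up for illustration. It assumes the Hadoop jars and a reachable JobTracker, and it prints the TaskTracker count, per-job map/reduce progress, and counters:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.Locale;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.ClusterStatus;
import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobStatus;
import org.apache.hadoop.mapred.RunningJob;

public class JobProgressDump {

    /** Hypothetical helper: formats a progress fraction in [0, 1] as a percentage. */
    static String percent(float fraction) {
        return String.format(Locale.ROOT, "%.1f%%", fraction * 100f);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder address: replace with your JobTracker's RPC host and port.
        JobClient client = new JobClient(
                new InetSocketAddress("your.job.tracker", 8021), new Configuration());
        try {
            // Cluster-wide view: number of live TaskTrackers and running tasks.
            ClusterStatus cluster = client.getClusterStatus();
            System.out.println("TaskTrackers: " + cluster.getTaskTrackers()
                    + ", running maps: " + cluster.getMapTasks()
                    + ", running reduces: " + cluster.getReduceTasks());

            // Jobs that have not completed yet.
            for (JobStatus status : client.jobsToComplete()) {
                System.out.println(status.getJobID()
                        + " map=" + percent(status.mapProgress())
                        + " reduce=" + percent(status.reduceProgress()));

                // Non-deprecated lookup by JobID; may return null if the job is gone.
                RunningJob job = client.getJob(status.getJobID());
                if (job == null) {
                    continue;
                }
                Counters counters = job.getCounters();
                if (counters == null) {
                    continue;
                }
                for (Counters.Group group : counters) {
                    for (Counters.Counter counter : group) {
                        System.out.println("  " + group.getDisplayName() + " / "
                                + counter.getDisplayName() + " = " + counter.getCounter());
                    }
                }
            }
        } finally {
            client.close();
        }
    }
}
```

For per-task progress, JobClient also has getMapTaskReports and getReduceTaskReports, which take a JobID and return TaskReport arrays; I have not tried those against CDH3 either.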

Surely there are further options. Start with http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobClient.html for further details.


I am not sure if this is correct, but you can try HUE. I think HUE gives information about jobs. Since it's open source, you can see how it accesses the JobTracker and NameNode.
