Progress rate during map phase (LATE scheduler) - Hadoop
I am trying to find out the progress rate of the map tasks. If someone can help me out it will be great !! Thanks !!
There are two ways we monitor the map and reduce progress of a job.
The first is the web interface at http://pdhadoop1:50030, where pdhadoop1 is the machine running the JobTracker (in my setup, that is the namenode machine).
The other way is from inside the job driver, where it is possible to write progress to the console (or elsewhere).
After the job is submitted, we enter a while loop that runs until job.isComplete() returns true. Inside the loop we do
System.out.println(String.format("Progress of Page views ETL Job %s:", job.getJobID().toString()));
System.out.println(String.format("\tMap: %f, Reduce: %f", job.mapProgress(), job.reduceProgress()));
Then we call Thread.sleep(60000), and the loop keeps going until the job is complete.
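Here is a minimal sketch of that polling loop. To keep it compilable without Hadoop on the classpath, it uses a hypothetical JobStatusView interface that mirrors only the three Job methods mentioned above (isComplete, mapProgress, reduceProgress); the interface, the class name ProgressPoller, and the output format are my own assumptions, not part of Hadoop. In a real driver you would wrap your Job in a small adapter and deal with the checked exceptions the Hadoop methods may throw.

```java
import java.util.Locale;

// Hypothetical stand-in for the three Hadoop Job methods used in the answer,
// so the sketch compiles without Hadoop on the classpath.
interface JobStatusView {
    boolean isComplete();
    float mapProgress();    // fraction in [0, 1]
    float reduceProgress(); // fraction in [0, 1]
}

class ProgressPoller {
    // Formats one progress line from the two progress fractions.
    static String formatProgress(float map, float reduce) {
        return String.format(Locale.ROOT, "\tMap: %.2f, Reduce: %.2f", map, reduce);
    }

    // Prints one progress line per interval until the job reports completion.
    // Returns the number of progress lines printed.
    static int pollUntilComplete(JobStatusView job, long intervalMillis) {
        int polls = 0;
        while (!job.isComplete()) {
            System.out.println(formatProgress(job.mapProgress(), job.reduceProgress()));
            polls++;
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop polling.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return polls;
    }
}
```

With a 60000 ms interval this prints one line per minute, the same as the loop described above.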
With both of these I am able to watch the progress of the map and reduce components of a job.
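Since the question asks about the progress rate rather than the raw progress, here is a rough sketch of how a rate could be derived from a progress reading. As far as I understand the LATE paper, the progress rate is the progress score divided by the time the task has been running, and the estimated time left is (1 - progress) / rate. Note the assumptions: job.mapProgress() is a job-level figure, not the per-task score that LATE works with, and the class and method names below are made up for illustration.

```java
class ProgressRate {
    // Progress per second: progress fraction divided by elapsed running time.
    // Returns 0 when no time has elapsed, to avoid dividing by zero.
    static double rate(double progress, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return progress / (elapsedMillis / 1000.0);
    }

    // Estimated seconds until completion: (1 - progress) / rate.
    // Returns infinity when no progress has been observed yet.
    static double estimatedSecondsLeft(double progress, long elapsedMillis) {
        double r = rate(progress, elapsedMillis);
        return r > 0.0 ? (1.0 - progress) / r : Double.POSITIVE_INFINITY;
    }
}
```

For example, a map phase at progress 0.5 after 100 seconds has a rate of 0.005 per second and an estimated 100 seconds left.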
The web interface also lets you look at logs and other useful information, such as counters, records, and bytes. A very nice feature.
I hope that helps. :)
EDIT: This wiki page, http://wiki.apache.org/hadoop/WebApp_URLs, lists these URLs:
The Job Tracker can be found at http://localhost:50030
The Task Tracker can be found at http://localhost:50060
The NameNode / Filesystem / log browser can be found at http://localhost:50070
The SecondaryNameNode can be found at http://localhost:50090
I think what you put in place of localhost depends on which of these you want to look at. I haven't played with all of them; I generally just use 50030 and 50070, both of which I point at my namenode.