Fault Tolerance in MapReduce
I was reading about Hadoop and how fault tolerant it is. I read about HDFS and how failures of master and slave nodes are handled. However, I couldn't find any document that mentions how MapReduce performs fault tolerance. In particular, what happens when the master node containing the JobTracker goes down, or when any of the slave nodes goes down?
Could anyone point me to some links or references that explain this in detail?
Fault tolerance of the MapReduce layer depends on the Hadoop version. In versions before Hadoop 0.21, no checkpointing was done, and a failure of the JobTracker would lead to loss of data.
However, starting with Hadoop 0.21, checkpointing was added: the JobTracker records its progress in a file. When a JobTracker starts up, it looks for such data, so that it can restart work from where it left off.
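As a rough illustration of where that recovery switch lives, here is a minimal sketch using Hadoop's Configuration API. I believe the relevant key in the Hadoop 0.21 / 1.x era is mapred.jobtracker.restart.recover, but treat the exact property name as an assumption and verify it against your version; in practice it would be set in mapred-site.xml on the JobTracker node rather than in code.

    import org.apache.hadoop.conf.Configuration;

    public class JobTrackerRecoveryConfig {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // Assumed property name for job recovery after a JobTracker
            // restart (Hadoop 0.21 / 1.x era); verify against your version.
            conf.setBoolean("mapred.jobtracker.restart.recover", true);

            System.out.println("restart.recover = "
                    + conf.getBoolean("mapred.jobtracker.restart.recover", false));
        }
    }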
FAULT TOLERANCE IN HADOOP
If the JobTracker does not receive any heartbeat from a TaskTracker for a specified period of time (by default, 10 minutes), the JobTracker concludes that the worker associated with that TaskTracker has failed. When this happens, the JobTracker needs to reschedule all pending and in-progress tasks to another TaskTracker, because the intermediate data belonging to the failed TaskTracker may no longer be available.
Completed map tasks also need to be rescheduled if they belong to incomplete jobs, because the intermediate results residing on the failed TaskTracker's file system may not be accessible to the reduce tasks.
A TaskTracker can also be blacklisted. In this case, the blacklisted TaskTracker remains in communication with the JobTracker, but no tasks are assigned to the corresponding worker. When a given number of tasks (by default, 4) belonging to a specific job managed by a TaskTracker fail, the system considers that a fault has occurred.
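The two thresholds mentioned above (the 10-minute heartbeat expiry and the 4 task failures before blacklisting) are configurable. A minimal sketch follows, assuming the Hadoop 1.x-era property names mapred.tasktracker.expiry.interval and mapred.max.tracker.failures; check the names against your version, and note that in a real cluster they would live in mapred-site.xml.

    import org.apache.hadoop.conf.Configuration;

    public class TrackerFaultToleranceDefaults {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // How long the JobTracker waits without a heartbeat before it
            // declares a TaskTracker lost (milliseconds, default 10 minutes).
            long expiryMs = conf.getLong("mapred.tasktracker.expiry.interval", 600000L);

            // How many task failures of one job on a TaskTracker are tolerated
            // before that tracker is blacklisted for the job (default 4).
            int maxFailures = conf.getInt("mapred.max.tracker.failures", 4);

            System.out.println("Tracker expiry interval: " + expiryMs + " ms");
            System.out.println("Max task failures before blacklisting: " + maxFailures);
        }
    }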
Some of the relevant information in the heartbeats the TaskTracker sends to the JobTracker is:
● The TaskTrackerStatus
● Whether the TaskTracker was restarted
● Whether it is the first heartbeat
● Whether the node requires more tasks to execute
A sketch of this payload is given after the list.
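To make the list above concrete, here is a stripped-down, hypothetical Java sketch of a heartbeat payload. The class and field names are mine, not the real org.apache.hadoop.mapred types, which carry considerably more state.

    // Hypothetical, simplified model of a TaskTracker heartbeat payload.
    // It only mirrors the fields listed above; the real Hadoop heartbeat
    // RPC and TaskTrackerStatus class contain much more information.
    public class HeartbeatSketch {

        static class TaskTrackerStatusSketch {
            String trackerName;
            long availablePhysicalMemory; // bytes
            long availableVirtualMemory;  // bytes
            int numCpuCores;
        }

        static class Heartbeat {
            TaskTrackerStatusSketch status; // worker resources (memory, CPU, ...)
            boolean restarted;              // tracker was restarted since last contact
            boolean initialContact;         // this is the first heartbeat
            boolean acceptNewTasks;         // the node wants more tasks to execute
        }
    }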
The TaskTrackerStatus contains information about the worker managed by the TaskTracker, such as the available virtual and physical memory and information about the CPU. The JobTracker keeps the blacklist with the faulty TaskTrackers and also the last heartbeat received from each of them. So, when a new restarted/first heartbeat is received, the JobTracker, using this information, may decide whether to restart the TaskTracker or to remove it from the blacklist.
After that, the status of the TaskTracker is updated in the JobTracker and a HeartbeatResponse is created. This HeartbeatResponse contains the next actions to be taken by the TaskTracker. If the TaskTracker requests new tasks (this is a parameter of the heartbeat) and it is not blacklisted, then cleanup tasks and setup tasks are created (the cleanup/setup mechanisms are not covered further here). If there are no cleanup or setup tasks to perform, the JobTracker gets new tasks. When tasks are available, a LaunchTaskAction is encapsulated for each of them, and then the JobTracker also looks for:
- Tasks to be killed
- Jobs to kill/cleanup
- Tasks whose output has not yet been saved
All these actions, if they apply, are added to the list of actions to be sent in the HeartbeatResponse. The fault tolerance mechanisms implemented in Hadoop are limited to reassigning tasks when a given execution fails. Two scenarios are supported:
1. If a task assigned to a given TaskTracker fails, the heartbeat is used to notify the JobTracker, which will reassign the task to another node if possible.
2. If a TaskTracker fails, the JobTracker will notice because it stops receiving heartbeats from that TaskTracker. The JobTracker will then assign the tasks that the failed TaskTracker had to another TaskTracker.
There is also a single point of failure in the JobTracker: if it fails, the whole execution fails.
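Putting the two scenarios together, the master's reaction to heartbeats (and to their absence) can be sketched roughly as follows. This is a hypothetical illustration in Java, not Hadoop's actual implementation; the names JobTrackerSketch, onHeartbeat, and reassign are mine.

    import java.util.*;

    // Hypothetical sketch of how a JobTracker-like master could react to
    // task failures and lost TaskTrackers; not Hadoop's real code.
    public class JobTrackerSketch {
        static final long EXPIRY_MS = 10 * 60 * 1000; // default 10-minute timeout

        private final Map<String, Long> lastHeartbeat = new HashMap<>();
        private final Map<String, List<String>> tasksOnTracker = new HashMap<>();

        // Scenario 1: a task failure is reported inside a heartbeat.
        public void onTaskFailed(String tracker, String taskId) {
            reassign(taskId, tracker); // schedule the task on some other tracker
        }

        public void onHeartbeat(String tracker, long now) {
            lastHeartbeat.put(tracker, now);
        }

        // Scenario 2: a tracker stops sending heartbeats and is declared lost.
        public void checkExpiredTrackers(long now) {
            for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
                if (now - e.getValue() > EXPIRY_MS) {
                    String lost = e.getKey();
                    // Re-run everything the lost tracker was responsible for,
                    // including completed maps of incomplete jobs (their
                    // intermediate output is gone with the node).
                    for (String taskId : tasksOnTracker.getOrDefault(lost, List.of())) {
                        reassign(taskId, lost);
                    }
                }
            }
        }

        private void reassign(String taskId, String excludedTracker) {
            // Pick any healthy tracker other than excludedTracker and
            // hand it the task; details omitted in this sketch.
        }
    }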
The main benefits of the standard fault tolerance approach implemented in Hadoop are its simplicity and that it seems to work well in local clusters. However, the standard approach is not enough for large distributed infrastructures: the distance between nodes may be too big, and the time lost in reassigning a task may slow down the system.
The master node (the NameNode for HDFS, and likewise the JobTracker for MapReduce) is a single point of failure in Hadoop. If it goes down, the system is unavailable.
Slave (computational) node failures are fine: anything running on them at the time of failure is simply rerun on a different node. In fact, this may happen even if a node is merely running slowly, thanks to speculative execution.
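Speculative execution can be switched on or off per job. A minimal sketch follows, assuming the Hadoop 1.x-era property names mapred.map.tasks.speculative.execution and mapred.reduce.tasks.speculative.execution (verify against your version).

    import org.apache.hadoop.conf.Configuration;

    public class SpeculativeExecutionConfig {
        public static void main(String[] args) {
            Configuration conf = new Configuration();

            // When enabled, slow-running tasks may be launched a second time
            // on other nodes, and the first copy to finish wins.
            conf.setBoolean("mapred.map.tasks.speculative.execution", true);
            conf.setBoolean("mapred.reduce.tasks.speculative.execution", true);
        }
    }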
There are some projects / companies looking to eliminate the single point of failure. Googling "hadoop ha" (high availability) should get you on your way if you're interested.