Description:
Hadoop daemons are background processes that run a Hadoop cluster. Together they provide the system's distributed storage, resource management, and data processing.
NameNode
Role: The NameNode is the master daemon managing HDFS metadata and the file system namespace.
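Function: Tracks which DataNodes hold each block and handles namespace operations such as opening, closing, and renaming files and directories.
Failure Handling: In a classic (non-HA) cluster the NameNode is a single point of failure. The Secondary NameNode or Checkpoint Node is not a hot standby; it only keeps a recent checkpoint of the metadata, which can help rebuild the NameNode after a failure.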
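The two kinds of metadata described above (the namespace and the block locations) can be pictured with a small in-memory model. This is only an illustrative sketch: the class name ToyNameNode and its methods are hypothetical and do not correspond to real HDFS classes or APIs.

```python
class ToyNameNode:
    """Toy model of NameNode metadata (hypothetical, not the real HDFS structures)."""

    def __init__(self):
        # Namespace: file path -> ordered list of block IDs.
        self.namespace = {}
        # Block map: block ID -> set of DataNode IDs that hold a replica.
        self.block_locations = {}

    def add_file(self, path, block_ids):
        """Record a new file and the blocks it is split into."""
        self.namespace[path] = list(block_ids)
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set())

    def report_block(self, datanode_id, block_id):
        """Record that a DataNode reported holding a replica of a block."""
        self.block_locations.setdefault(block_id, set()).add(datanode_id)

    def locate(self, path):
        """Return (block ID, sorted DataNode IDs) for each block of a file."""
        return [(b, sorted(self.block_locations[b])) for b in self.namespace[path]]
```

A client read would start with something like locate("/logs/a.txt") and then fetch each block directly from one of the listed DataNodes; file data itself never passes through the NameNode.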
DataNode
Role: Stores actual data in HDFS and sends heartbeat signals to the NameNode.
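Function: Serves block read/write requests from clients and creates, deletes, and replicates blocks on instruction from the NameNode.
Failure Handling: Each block is stored on several DataNodes. When a DataNode stops sending heartbeats, the NameNode marks it dead and schedules new copies of its blocks from the surviving replicas, restoring the configured replication factor.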
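The heartbeat and re-replication logic can be sketched as follows. This is a simplified illustration, not the actual HDFS algorithm: the function names, the timeout value passed in, and the "pick the alphabetically first live node" placement rule are all assumptions made for the example (real HDFS placement also considers racks and node load).

```python
def find_dead_datanodes(last_heartbeat, now, timeout_s):
    """Return DataNodes whose last heartbeat is older than timeout_s seconds."""
    return {dn for dn, seen in last_heartbeat.items() if now - seen > timeout_s}


def plan_rereplication(block_locations, dead, live, replication_factor):
    """Map each under-replicated block to the live nodes that should get a new copy."""
    plan = {}
    for block_id, holders in block_locations.items():
        surviving = holders - dead
        missing = replication_factor - len(surviving)
        # A block with no surviving replica cannot be copied from anywhere.
        if missing > 0 and surviving:
            candidates = sorted(live - surviving)
            plan[block_id] = candidates[:missing]
    return plan
```

For example, if dn3 has been silent for longer than the timeout and held one of two replicas of blk_1, the plan maps blk_1 to one live node that does not already hold it.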
Secondary NameNode
Role: Performs periodic checkpoints of the HDFS metadata.
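Function: Offloads checkpointing from the NameNode by merging the fsimage (the on-disk namespace snapshot) with the edit log and returning the merged fsimage. This keeps the edit log small and shortens NameNode restarts. Despite its name, it is not a standby or backup NameNode.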
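A checkpoint amounts to replaying the edit log on top of the last fsimage to produce a new fsimage. The sketch below shows the idea with a dictionary standing in for the fsimage and tuples standing in for edit-log records; the operation names ("create", "delete", "rename") and the record layout are hypothetical simplifications, not the real HDFS on-disk formats.

```python
def apply_edits(fsimage, edit_log):
    """Replay edit-log records on a copy of fsimage and return the new image.

    fsimage:  dict mapping file path -> list of block IDs
    edit_log: list of tuples such as ("create", path, block_ids),
              ("delete", path) or ("rename", old_path, new_path)
    """
    namespace = dict(fsimage)  # leave the old checkpoint untouched
    for op, path, *args in edit_log:
        if op == "create":
            namespace[path] = list(args[0])
        elif op == "delete":
            namespace.pop(path, None)
        elif op == "rename":
            namespace[args[0]] = namespace.pop(path)
        else:
            raise ValueError(f"unknown edit-log operation: {op}")
    return namespace
```

Once the merged image is saved, the replayed edits are no longer needed, which is why regular checkpoints stop the edit log from growing without bound.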
JobTracker
Role: Manages MapReduce jobs in a Hadoop 1.x (MRv1) cluster. In Hadoop 2 and later, YARN replaces the JobTracker and TaskTracker.
Function: Handles job scheduling, progress monitoring, and fault tolerance.
TaskTracker
Role: Executes Map and Reduce tasks on worker nodes.
Function: Reports task progress and resource usage to the JobTracker.
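Task Execution: Each TaskTracker has a fixed number of map and reduce slots and runs each task in a separate child JVM. (Containers are a YARN concept and do not apply to TaskTrackers.) It reports task status to the JobTracker through periodic heartbeats.
Failure Handling: If a TaskTracker stops sending heartbeats, the JobTracker marks it as failed and reschedules its tasks on other available TaskTrackers. Map tasks that had already completed on the failed node are also rerun, because their output was stored on that node's local disk.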
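The reassignment step can be sketched as below. This is a minimal illustration under stated assumptions: the function name, the data shapes, and the "most free slots first" rule are hypothetical and much simpler than the real JobTracker schedulers.

```python
def reassign_tasks(assignments, failed_tracker, free_slots):
    """Move tasks off a failed TaskTracker onto trackers with free slots.

    assignments: dict mapping task ID -> TaskTracker ID
    free_slots:  dict mapping TaskTracker ID -> number of free task slots
    Returns (new_assignments, unscheduled_task_ids).
    """
    new_assignments = dict(assignments)
    slots = dict(free_slots)
    slots.pop(failed_tracker, None)  # never schedule on the failed node
    unscheduled = []

    orphaned = sorted(t for t, tr in assignments.items() if tr == failed_tracker)
    for task_id in orphaned:
        # Pick the tracker with the most free slots (ties: alphabetical order).
        target = max(sorted(slots), key=lambda tr: slots[tr], default=None)
        if target is None or slots[target] == 0:
            # No capacity right now; the task must wait for a free slot.
            del new_assignments[task_id]
            unscheduled.append(task_id)
            continue
        new_assignments[task_id] = target
        slots[target] -= 1
    return new_assignments, unscheduled
```

Tasks that cannot be placed are returned separately so that a caller could retry them after a later heartbeat reports a free slot.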