Availability: Apache Hadoop 2.0 and above.
It is a framework for YARN-based data processing applications in Hadoop.
It lets a job be expressed as a complex directed acyclic graph (DAG) of tasks for processing data.
Directed Acyclic Graph (DAG) – defines the overall job. Each DAG object represents a job
Vertex – defines the user logic and the resources needed to run it. Each Vertex corresponds to one step in the job.
Edge – defines the connection between producer and consumer vertices.
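
The three concepts above can be illustrated with a small toy model. This is only a sketch in plain Python and not the framework's actual API: the class names DAG and Vertex mirror the terms above, while the logic field, the add_vertex, add_edge, execution_order and run methods, and the word-count example are all hypothetical names chosen for illustration. It assumes each vertex is a single function and that an edge simply passes the producer's output to the consumer.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Vertex:
    """One step in the job: a name plus the user logic to run (hypothetical model)."""
    name: str
    logic: Callable[[list], object]  # receives the outputs of its producer vertices


@dataclass
class DAG:
    """The overall job: vertices connected by (producer, consumer) edges."""
    name: str
    vertices: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_vertex(self, vertex):
        self.vertices[vertex.name] = vertex
        return self

    def add_edge(self, producer, consumer):
        # An edge connects a producer vertex to a consumer vertex.
        self.edges.append((producer, consumer))
        return self

    def execution_order(self):
        """Topological sort (Kahn's algorithm): producers always come first."""
        indegree = {name: 0 for name in self.vertices}
        for _, consumer in self.edges:
            indegree[consumer] += 1
        ready = deque(n for n, d in indegree.items() if d == 0)
        order = []
        while ready:
            current = ready.popleft()
            order.append(current)
            for producer, consumer in self.edges:
                if producer == current:
                    indegree[consumer] -= 1
                    if indegree[consumer] == 0:
                        ready.append(consumer)
        if len(order) != len(self.vertices):
            raise ValueError("graph has a cycle, so it is not a DAG")
        return order

    def run(self):
        """Run each vertex's logic once, feeding it its producers' outputs."""
        results = {}
        for name in self.execution_order():
            inputs = [results[p] for p, c in self.edges if c == name]
            results[name] = self.vertices[name].logic(inputs)
        return results


# A two-step job: split text into words, then count them.
dag = DAG("word_count")
dag.add_vertex(Vertex("tokenizer", lambda inputs: "a b a".split()))
dag.add_vertex(Vertex("summation",
                      lambda inputs: {w: inputs[0].count(w) for w in set(inputs[0])}))
dag.add_edge("tokenizer", "summation")
results = dag.run()
```

Here the DAG object is the whole job, each Vertex is one step, and the single edge makes "tokenizer" the producer for "summation". In a real cluster the steps would run as distributed tasks rather than in-process functions, and edges would also describe how data is moved between them.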