Apache Oozie - Hadoop Open Source Code

Apache Oozie

  • It is a workflow scheduler system that manages Apache Hadoop jobs.

  • It also provides a mechanism to run jobs on a given schedule.

  • It is integrated with the rest of the Hadoop stack and supports several types of Hadoop jobs out of the box, such as Java MapReduce, Streaming MapReduce, Pig, Hive, Sqoop, and DistCp, as well as system-specific jobs such as Java programs and shell scripts.

  • It is a scalable, reliable, and extensible system.

  • It is implemented as a Java web application that runs in a Java servlet container.

  • Oozie Bundle provides a way to package multiple coordinator and workflow jobs and to manage the life cycle of those jobs.

Two Types of Oozie Jobs

  • 1. Workflow engine: It stores and runs workflows composed of Hadoop jobs, e.g., MapReduce, Pig, and Hive.

  • 2. Coordinator engine: It runs workflow jobs based on predefined schedules and the availability of data.
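
The division of labor between the two engines can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Oozie's implementation or API: the names `run_workflow`, `nominal_times`, and `ready_runs` are hypothetical, actions are plain Python callables rather than Hadoop jobs, and data availability is modeled as a set of timestamps.

```python
from datetime import datetime, timedelta

# Hypothetical toy model, not Oozie's API: each action maps to a callable plus
# the names of the nodes to visit on success ("ok") and on failure ("error").
def run_workflow(actions, start):
    """Follow ok/error transitions from `start` until 'end' or 'kill' is reached."""
    visited = []
    node = start
    while node not in ("end", "kill"):
        func, ok_node, error_node = actions[node]
        visited.append(node)
        try:
            func()
            node = ok_node       # action succeeded: take the "ok" transition
        except Exception:
            node = error_node    # action failed: take the "error" transition
    return visited, node


def nominal_times(start, end, frequency):
    """Yield the scheduled times from `start` up to, but not including, `end`."""
    current = start
    while current < end:
        yield current
        current += frequency


def ready_runs(start, end, frequency, available):
    """Return the scheduled times whose input data is marked as available."""
    return [t for t in nominal_times(start, end, frequency) if t in available]
```

As a usage example, a two-step workflow in which both actions succeed ends at the `end` node, and a daily schedule only yields the days whose data has arrived:

```python
actions = {
    "import": (lambda: None, "transform", "kill"),
    "transform": (lambda: None, "end", "kill"),
}
print(run_workflow(actions, "import"))  # (['import', 'transform'], 'end')

arrived = {datetime(2024, 1, 1), datetime(2024, 1, 3)}
print(ready_runs(datetime(2024, 1, 1), datetime(2024, 1, 4),
                 timedelta(days=1), arrived))
```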


  • It has a client API and a command-line interface, which can be used to launch, control, and monitor jobs; the client API can be called from a Java application.

  • Its Web Service APIs allow jobs to be controlled from any machine that can reach the Oozie server over HTTP.

  • It can execute jobs that are scheduled to run periodically.

  • It can send email notifications when jobs complete.
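
To make the Web Service APIs more concrete, the sketch below builds HTTP requests for querying and controlling a job using only the Python standard library. Several details are assumptions to check against your own installation: the server address in `OOZIE_URL`, the `v1` API version in the path, and the `status` field read from the JSON reply. The helper names `job_info_request`, `job_action_request`, and `fetch_status` are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Assumed server address; adjust host, port, and path to your installation.
OOZIE_URL = "http://localhost:11000/oozie"

# Job state changes this sketch allows; other values are rejected up front.
ALLOWED_ACTIONS = {"start", "suspend", "resume", "kill"}


def job_info_request(job_id, base_url=OOZIE_URL):
    """Build a GET request for a job's details (no network access happens here)."""
    query = urlencode({"show": "info"})
    return Request(f"{base_url}/v1/job/{quote(job_id)}?{query}", method="GET")


def job_action_request(job_id, action, base_url=OOZIE_URL):
    """Build a PUT request that asks the server to change a job's state."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    query = urlencode({"action": action})
    return Request(f"{base_url}/v1/job/{quote(job_id)}?{query}", method="PUT")


def fetch_status(job_id, base_url=OOZIE_URL):
    """Send the info request and return the 'status' field of the JSON reply."""
    with urlopen(job_info_request(job_id, base_url)) as response:
        return json.load(response)["status"]
```

Building the request separately from sending it keeps the URL logic easy to inspect without a running server; only `fetch_status` needs network access. Depending on how the server is secured, real requests may also need authentication, which this sketch omits.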