What are the 3 stages of MapReduce?

  • Map stage
    The map stage takes <key, value> pairs as input and produces intermediate <key, value> pairs as output.
  • Shuffle stage
    All intermediate values associated with the same intermediate key are grouped together and passed as input to the reduce stage.
  • Reduce stage
    All intermediate values that belong to the same intermediate key are processed, and a new set of <key, value> pairs is generated as output and stored in HDFS.
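
The three stages can be illustrated with a small in-memory word count. This is only a sketch in plain Python, not Hadoop code: the function names map_stage, shuffle_stage and reduce_stage and the sample records are made up for illustration, and the result is returned as a dict rather than written to HDFS.

```python
from collections import defaultdict


def map_stage(records):
    """Map: turn each input <key, value> pair into intermediate <word, 1> pairs."""
    for _line_no, line in records:
        for word in line.lower().split():
            yield word, 1


def shuffle_stage(pairs):
    """Shuffle: group all intermediate values under their intermediate key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_stage(groups):
    """Reduce: collapse each key's list of values into one output <key, value> pair."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input: <line number, line text> pairs.
records = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "the fox")]

grouped = shuffle_stage(map_stage(records))
counts = reduce_stage(grouped)

print(grouped["the"])  # [1, 1, 1]
print(counts["fox"])   # 2
```

In a real cluster the map and reduce functions run in parallel on many nodes and the framework performs the shuffle between them; here all three run in one process purely to show how the data changes shape from stage to stage.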