How to generate frequent itemsets using MapReduce

Description

Frequent itemset mining aims to find regularities in a transaction dataset. With MapReduce, the map phase records the presence of each item in a transaction, and the reduce phase counts the items and filters out those with low frequency. The input consists of a set of transactions, and each transaction contains several items. The Map function reads the items from each transaction and emits a key-value pair for each one: the key is the item and the value is 1. After the map phase is completed, the Reduce function is executed and aggregates the values corresponding to each key, which gives the total number of occurrences of that item. From these results, the frequent items are selected on the basis of the minimum support value: an item is frequent if its count is at least the minimum support.
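As a point of reference, the result the MapReduce job is expected to produce can be checked on a small dataset in a single process. The sketch below is only an illustration: the transactions, the item names, and the MIN_SUPPORT threshold are made up, and it assumes that support is an absolute count and that an item is counted once per transaction.

```python
from collections import Counter

# Hypothetical toy dataset: each transaction is a list of items.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
]
MIN_SUPPORT = 2  # assumed absolute threshold (number of transactions)

# Count each item once per transaction in which it appears.
counts = Counter(item for t in transactions for item in set(t))

# Keep only the items whose count reaches the threshold.
frequent = {item: n for item, n in counts.items() if n >= MIN_SUPPORT}

print(sorted(frequent.items()))
# prints [('bread', 3), ('butter', 2), ('milk', 3)]
```

Here "eggs" appears in only one transaction, so it falls below the threshold and is dropped.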

Mapper Function

Map phase input: <k1, v1>

k1 - line number

v1 - transaction

// get the items from each transaction

// the count for each item is set to 1

for each item in v1

    k2 - item

    v2 - 1

    Output(k2, v2)

end for
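The mapper pseudocode above could be sketched in Python as follows. This is a minimal illustration rather than Hadoop code: the function name map_transaction is hypothetical, and it assumes that each transaction is one line of whitespace-separated items and that a repeated item counts once per transaction.

```python
from typing import Iterator, Tuple


def map_transaction(line_no: int, transaction: str) -> Iterator[Tuple[str, int]]:
    """Emit (item, 1) for every distinct item in one transaction line.

    line_no is the input key (k1); it is not needed to produce the output.
    """
    items = transaction.split()  # assumed format: whitespace-separated items
    for item in sorted(set(items)):  # count an item once per transaction
        yield item, 1  # k2 = item, v2 = 1


pairs = list(map_transaction(0, "bread milk bread"))
print(pairs)
# prints [('bread', 1), ('milk', 1)]
```

If repeated items within a transaction should each be counted, the set() call can be dropped so that one pair is emitted per occurrence.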

Reducer Function

// count the occurrences of each item

// keep only the items that satisfy the minimum support (minsup)

Reduce phase input: <k2, List<v2>>

sum the values in List<v2> to get the occurrences of the item

if (occurrences of the item >= minsup)

    k3 - frequent item

    v3 - occurrences

    Emit(k3, v3)

end if

Output: <k3, v3>
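The reducer can be sketched in the same way. The names reduce_item and shuffle are hypothetical, the shuffle function only imitates the grouping step that a MapReduce framework performs between the two phases, and min_support is assumed to be an absolute count.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group mapper output by key, as the framework would before reducing."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_item(item: str, values: List[int], min_support: int) -> Iterator[Tuple[str, int]]:
    """Sum the values for one item and emit it only if it is frequent."""
    occurrences = sum(values)
    if occurrences >= min_support:
        yield item, occurrences  # k3 = frequent item, v3 = occurrences


# Hypothetical mapper output for three transactions.
mapped = [("bread", 1), ("milk", 1), ("bread", 1),
          ("eggs", 1), ("milk", 1), ("bread", 1)]

frequent = [
    result
    for item, values in sorted(shuffle(mapped).items())
    for result in reduce_item(item, values, min_support=2)
]
print(frequent)
# prints [('bread', 3), ('milk', 2)]
```

Applying the threshold inside the reducer means infrequent items are never written to the output, which matches the two-column result described in the Output section.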

Input

[Figure: Frequent itemset generation using MapReduce]

Output

First column – Frequent Item

Second column – Frequency of items

[Figure: Frequent itemset generation using MapReduce]