Frequent itemset mining aims to find regularities in a transaction dataset. In the MapReduce formulation, the map phase records the presence of each data item in a transaction, and the reduce phase discards the items whose frequency is too low. The input consists of a set of transactions, each containing several items. The map function reads the items of each transaction and emits a key-value pair for each one: the key is the item and the value is 1. After the map phase completes, the reduce function aggregates the values that share a key, which gives the number of occurrences of each item. Frequent items are then selected from these counts on the basis of the minimum support value.
Mapper Function
Map Phase input: <k1, v1>
k1 - Line no
v1 - Transaction
// get items from each transaction
// item count set to 1
for each item in v1
    k2 - item
    v2 - 1
    Output(k2, v2)
End for
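The mapper can be sketched in Python as follows. This is a minimal illustration rather than a Hadoop implementation: the function name `map_transaction` is hypothetical, and the sketch assumes that the items in a transaction are separated by whitespace, which the text does not specify.

```python
from typing import Iterator, Tuple


def map_transaction(line_no: int, transaction: str) -> Iterator[Tuple[str, int]]:
    """Emit one (item, 1) pair for every item in a single transaction.

    line_no corresponds to k1 and transaction to v1; line_no is unused,
    as in the pseudocode.
    """
    # Assumption: items are whitespace-separated.
    for item in transaction.split():
        # k2 is the item, v2 is the constant count 1.
        yield item, 1


pairs = list(map_transaction(0, "bread milk eggs"))
print(pairs)  # [('bread', 1), ('milk', 1), ('eggs', 1)]
```

Each pair is emitted inside the loop, so a transaction with n items produces n key-value pairs.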
Reducer Function
Reduce Phase input: <k2, List<v2>>
// count the occurrences of each item
occurrences = sum of the values in List<v2>
// minimum support
if (occurrences >= minsup)
    k3 - Frequent item
    v3 - occurrences
    Emit(k3, v3)
End if
Output: <k3, v3>
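The reduce step can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used here: the names `shuffle` and `reduce_item` are hypothetical, the grouping that a MapReduce framework performs between the two phases is simulated in memory, the input pairs are made-up sample data, and the minimum support is assumed to be an absolute occurrence count.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group the mapper output by key, giving <k2, List<v2>>."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for item, count in pairs:
        grouped[item].append(count)
    return dict(grouped)


def reduce_item(item: str, counts: List[int], min_support: int) -> Optional[Tuple[str, int]]:
    """Return (item, occurrences) if the item meets min_support, else None."""
    occurrences = sum(counts)
    # Assumption: min_support is an absolute count, compared with >=.
    if occurrences >= min_support:
        return item, occurrences
    return None


# Made-up mapper output for three small transactions.
pairs = [("bread", 1), ("milk", 1), ("bread", 1),
         ("eggs", 1), ("bread", 1), ("milk", 1)]

frequent = []
for item, counts in sorted(shuffle(pairs).items()):
    result = reduce_item(item, counts, min_support=2)
    if result is not None:
        frequent.append(result)

print(frequent)  # [('bread', 3), ('milk', 2)]
```

With a minimum support of 2, `eggs` occurs only once and is dropped, while `bread` and `milk` are emitted together with their counts.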
Input
Output
First column – Frequent item
Second column – Frequency (number of occurrences) of the item