
<k1, v1>                      k1: Line number; v1: Record
<k2, v2>                      k2: Attribute with class label; v2: 1 (for each occurrence of the attribute with class label)
<k2, List<v2>> -> <k3, v3>    k3: Attribute with class label; v3: Frequency (count of occurrences)
<k3, v3> -> <k4, v4>          k4: Attribute; v4: Entropy, Information Gain, and Split Information
<k4, List<v4>> -> <k5, v5>    k5: Decision node (best splitting attribute); v5: Information Gain Ratio
<k5, v5> -> <k6, v6>          k6: Node ID; v6: Elements (attribute values)
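
The key-value pairs listed above can be traced on a small example. The following is a minimal single-process sketch in Python, not a Hadoop job and not necessarily the authors' implementation. It assumes comma-separated records with the class label in the last field, and it uses the standard definition of gain ratio (information gain divided by split information). The toy records, the function names, and the use of 0 as the node ID are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: "attribute_0,attribute_1,class_label" per record.
RECORDS = [
    "sunny,hot,no",
    "sunny,mild,no",
    "rainy,mild,yes",
    "rainy,hot,yes",
]


def map_counts(line_no, record):
    """<k1, v1> -> <k2, v2>: emit ((attribute index, value, class label), 1)."""
    *values, label = record.split(",")
    for idx, value in enumerate(values):
        yield (idx, value, label), 1


def reduce_counts(pairs):
    """<k2, List<v2>> -> <k3, v3>: sum the 1s into a frequency per key."""
    freq = Counter()
    for key, one in pairs:
        freq[key] += one
    return freq


def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c)


def gain_ratios(freq):
    """<k3, v3> -> <k4, v4>: entropy, gain, and split info per attribute,
    combined here into the gain ratio that the next step compares."""
    # attribute index -> attribute value -> Counter of class labels
    by_attr = defaultdict(lambda: defaultdict(Counter))
    for (idx, value, label), n in freq.items():
        by_attr[idx][value][label] += n

    ratios = {}
    for idx, values in by_attr.items():
        class_totals = Counter()
        for labels in values.values():
            class_totals.update(labels)
        total = sum(class_totals.values())
        base = entropy(class_totals.values())
        conditional = sum(
            sum(labels.values()) / total * entropy(labels.values())
            for labels in values.values()
        )
        gain = base - conditional
        split_info = entropy(sum(labels.values()) for labels in values.values())
        ratios[idx] = gain / split_info if split_info else 0.0
    return ratios


def run(records):
    """Chain the steps and return (best attribute, gain ratio, node)."""
    pairs = (p for n, rec in enumerate(records) for p in map_counts(n, rec))
    freq = reduce_counts(pairs)
    ratios = gain_ratios(freq)
    # <k4, List<v4>> -> <k5, v5>: the attribute with the highest gain ratio.
    best = max(ratios, key=ratios.get)
    # <k5, v5> -> <k6, v6>: a node ID (0 is an arbitrary choice here)
    # paired with the distinct values of the chosen attribute.
    elements = sorted({value for (idx, value, _label) in freq if idx == best})
    return best, ratios[best], (0, elements)


if __name__ == "__main__":
    print(run(RECORDS))
```

In the toy data the first attribute separates the two classes perfectly, so it is selected as the decision node. In a distributed setting each function would become a separate map or reduce task, with the framework grouping values by key between them.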