Research Area:  Data Mining
Knowledge Discovery and Data Mining (KDDM), a growing field of study argued to be very useful for discovering knowledge hidden in large datasets, is slowly finding application in Higher Educational Institutions (HEIs). While the literature shows that KDDM processes enable the discovery of knowledge useful for improving the performance of organisations, the limitations surrounding them contradict this argument. In extending the usefulness of KDDM processes to support HEIs, challenges were encountered, such as the discovery of course taking patterns associated with contextual information in educational datasets. The literature argues that existing KDDM processes are limited by their inability to generate patterns associated with contextual information; this research tested that claim and developed an artefact that overcame the limitation.
Design Science methodology was used to test and evaluate the KDDM artefact. The research used the CRISP-DM process model to analyse the educational dataset using the attributes course taking pattern, course difficulty level, optimum CGPA and time-to-degree, applying clustering, association rule and classification techniques. The results showed that neither clustering nor association rules produced course taking patterns. Classification produced course taking patterns that were partially linked to CGPA and time-to-degree, but optimum CGPA and time-to-degree could not be linked with contextual information. Hence the CRISP-DM process was modified to include three new stages, namely contextual data understanding, contextual data preparation and an additional data preparation (merging) stage, to see whether the contextual dataset could be separately mined and associated with the course taking pattern.
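As a rough illustration of the additional merging stage described above, the following Python sketch pairs each student's course taking pattern with a separately prepared course difficulty level pattern. It is a minimal sketch under stated assumptions and not the thesis's implementation: the data layout, the sample records, and the function names `contextual_data_preparation` and `merge_patterns` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: the real datasets and attribute encodings are not
# given in the abstract.
course_patterns: Dict[str, List[str]] = {
    "s1": ["CS101", "CS102"],
    "s2": ["CS101", "MA201"],
}
course_difficulty: Dict[str, str] = {
    "CS101": "low",
    "CS102": "medium",
    "MA201": "high",
}


def contextual_data_preparation(
    patterns: Dict[str, List[str]], difficulty: Dict[str, str]
) -> Dict[str, List[str]]:
    """Derive a difficulty level pattern per student from the contextual data.

    Courses missing from the contextual dataset are marked "unknown".
    """
    return {
        student: [difficulty.get(course, "unknown") for course in courses]
        for student, courses in patterns.items()
    }


def merge_patterns(
    patterns: Dict[str, List[str]], contextual: Dict[str, List[str]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Merging stage: pair each course with its difficulty level, in order."""
    return {
        student: list(zip(courses, contextual[student]))
        for student, courses in patterns.items()
    }


difficulty_patterns = contextual_data_preparation(course_patterns, course_difficulty)
merged = merge_patterns(course_patterns, difficulty_patterns)
```

The merged records could then be passed to a classifier so that predictions of CGPA and time-to-degree draw on both patterns together; that step is omitted here because the abstract does not specify the model or its features.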
The CRISP-DM model and the modified CRISP-DM model were tested as per the guidelines of Chapman et al. (2000). Process theory was used as the basis for the modification of the CRISP-DM process. Results showed that the course taking pattern, contextualised by the course difficulty level pattern, predicts optimum CGPA and time-to-degree. This research has contributed to knowledge by developing a new artefact (contextual factor mining in the CRISP-DM process) to predict optimum CGPA and optimum time-to-degree using the course taking pattern and the course difficulty level pattern.
Contribution to theory lay in extending the application of a few theories to explain the development, testing and evaluation of the KDDM artefact. Further contributions are the enhancement of a genetic algorithm (GA) to mine the course difficulty level pattern along with the course taking pattern, and pseudocode to verify the presence of the course difficulty level pattern. Contribution to practice lay in demonstrating the usefulness of the modified CRISP-DM process for prediction and simulation of the course taking pattern to predict the optimum CGPA and time-to-degree, thereby demonstrating that the artefact can be deployed in practice.
Name of the Researcher:  Bhaskaran, Subhashini Sailesh
Name of the Supervisor(s):  Lu K, Swenson A, Alaali M
Year of Completion:  2017
University:  Brunel University London
Thesis Link:   Home Page Url