PhD Research Proposal in Anomaly Detection by Applying Machine Learning Techniques

Anomaly detection is the task of discovering unusual patterns in data that do not conform to expected behavior; such patterns are referred to as outliers [1]. The massive volume of data and activity in a system makes detecting anomalies challenging. Because the range of possible anomalies is unbounded, analytics systems struggle to explore huge volumes of data to spot them. Scalable systems that automate the entire process are therefore needed. Anomalies occur rarely, but they can reveal serious and consequential threats to the system, such as cyber intrusions or fraud [2].
Machine learning algorithms enable an anomaly detection system (ADS) to detect anomalies efficiently with limited human intervention. An ADS based on machine learning techniques helps companies improve accuracy by identifying anomalies automatically [3]. Anomaly detection is used extensively in a diverse range of applications, including fraud detection in credit card transactions, health care, and insurance; fault identification in safety-critical systems; and military surveillance [4].
A typical machine learning based anomaly detector discriminates anomalies from normal behavior. However, defining a normal region that covers every feasible normal behavior is complex. Normal behavior also evolves over time, which complicates the discrimination. Moreover, imperfect data in the training set of the classification model makes anomaly detection much more difficult, and the scarcity of publicly available labeled data sets makes it more challenging still. The lack of a description of anomaly categories adds complexity when classifying newly arriving data. Finally, storing and retaining the massive volume of data required, and maintaining the computational capacity to process it over a long period, is not economically feasible.
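The idea of learning a normal region and flagging whatever falls outside it can be illustrated with a minimal sketch. It assumes one-dimensional readings, anomaly-free training data, and a simple mean plus or minus k standard deviations rule; the function names, the threshold k = 3, and the sample values are hypothetical and are not the method proposed here.

```python
import statistics

def fit_normal_region(train, k=3.0):
    """Estimate a 'normal' interval as mean +/- k standard deviations.

    Assumes the training readings contain no anomalies (an illustrative
    simplification).
    """
    mu = statistics.fmean(train)
    sigma = statistics.pstdev(train)
    return mu - k * sigma, mu + k * sigma

def is_anomaly(x, region):
    """Flag a reading that falls outside the learned normal interval."""
    low, high = region
    return x < low or x > high

# Hypothetical training readings, assumed to represent normal behavior.
train = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 9.7, 10.0]
region = fit_normal_region(train)

# Score newly arriving readings against the learned region.
flags = [is_anomaly(x, region) for x in [10.1, 9.6, 14.0]]
print(region, flags)  # flags -> [False, False, True]
```

Because normal behavior evolves over time, a region fitted once would gradually go stale; one possible remedy is to refit it periodically on a recent window of data.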
In anomaly detection, the performance of the detection system relies on the quality of the training data, since learning from erroneous samples severely degrades accuracy. Well-labeled training data is therefore necessary for a machine learning algorithm to learn to identify anomalies. In practice, however, the training data often contains noise that resembles actual anomalies, which makes discriminating anomalies a complex task. Overfitting and the curse of dimensionality are further issues for an anomaly detection system when classifying the behavior of the data.
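The effect of noisy training data can be seen in a small sketch. It assumes one-dimensional readings and compares a mean and standard deviation region with a median and MAD (median absolute deviation) region; the robust variant is a swapped-in illustrative technique, not part of the proposal, and all names and values are hypothetical.

```python
import statistics

def mean_std_region(train, k=3.0):
    """Normal interval from mean and standard deviation (noise-sensitive)."""
    mu = statistics.fmean(train)
    sigma = statistics.pstdev(train)
    return mu - k * sigma, mu + k * sigma

def median_mad_region(train, k=3.0):
    """Normal interval from median and MAD (less sensitive to noise)."""
    med = statistics.median(train)
    mad = statistics.median([abs(x - med) for x in train])
    scale = 1.4826 * mad  # makes MAD comparable to std for Gaussian data
    return med - k * scale, med + k * scale

def inside(x, region):
    """Return True when x lies within the learned normal interval."""
    low, high = region
    return low <= x <= high

# Hypothetical training set: normal readings plus two noisy samples that
# were wrongly treated as normal.
clean = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 9.7, 10.0]
noisy = clean + [25.0, 30.0]

naive = mean_std_region(noisy)
robust = median_mad_region(noisy)

# A reading of 14.0 is far from the clean data, yet the noise-inflated
# mean/std interval accepts it as normal.
print(inside(14.0, naive), inside(14.0, robust))
```

In this toy case two contaminated samples are enough to widen the mean and standard deviation interval until a clearly unusual reading passes as normal, while the median-based interval stays close to the clean data.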


  • [1] Gupta, Manish, Jing Gao, Charu C. Aggarwal, and Jiawei Han, “Outlier detection for temporal data: A survey”, IEEE Transactions on Knowledge and Data Engineering, Vol.26, No.9, pp.2250-2267, 2014.

  • [2] Habeeb, Riyaz Ahamed Ariyaluran, Fariza Nasaruddin, Abdullah Gani, Ibrahim Abaker Targio Hashem, Ejaz Ahmed, and Muhammad Imran, “Real-time big data processing for anomaly detection: A survey”, International Journal of Information Management, 2018.

  • [3] Omar, Salima, Asri Ngadi, and Hamid H. Jebur, “Machine learning techniques for anomaly detection: An overview”, International Journal of Computer Applications, Vol.79, No.2, 2013.

  • [4] Chandola, Varun, Arindam Banerjee, and Vipin Kumar, “Anomaly detection: A survey”, ACM Computing Surveys (CSUR), Vol.41, No.3, pp.15, 2009.
