Research Area:  Machine Learning
Intrusion detection technology has received increasing attention in recent years. Many researchers have proposed intrusion detection systems based on machine learning (ML) methods. However, two noteworthy factors affect the robustness of such models. One is the severe imbalance of network traffic across categories, and the other is the nonidentical distribution between the training set and the test set in feature space. This paper presents a multilevel intrusion detection framework named multilevel semi-supervised ML (MSML) to address these issues. The MSML framework includes four modules: 1) pure cluster extraction; 2) pattern discovery; 3) fine-grained classification (FC); and 4) model updating. In the pure cluster extraction module, we introduce the concept of a “pure cluster” and propose a hierarchical semi-supervised k-means algorithm to find all the pure clusters. In the pattern discovery module, we define the “unknown pattern” and apply a cluster-based method to find those unknown patterns. A test sample is then assigned to either a labeled known pattern or an unlabeled unknown pattern. The FC module performs fine-grained classification on the unknown pattern samples. The model updating module provides a mechanism for retraining. The KDDCUP99 dataset is used to evaluate MSML. Experimental results show that MSML is superior to other existing intrusion detection models in terms of overall accuracy, F1-score, and unknown pattern recognition capability.
Author(s) Name:  Haipeng Yao; Danyang Fu; Peiying Zhang; Maozhen Li and Yunjie Liu
Journal name:   IEEE Internet of Things Journal
Publisher name:  IEEE
Volume Information:  Volume: 6, Issue: 2 (April 2019), Page(s): 1949 - 1959
Paper Link:   https://ieeexplore.ieee.org/document/8477001
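
Illustrative sketch (not taken from the paper): the abstract describes a hierarchical semi-supervised k-means step that extracts "pure clusters". One plausible reading is sketched below in Python. It rests on several assumptions: a cluster is treated as pure when all of its labeled samples share a single label, impure clusters are split again with plain k-means, and clusters with no labeled samples are set aside as candidates for unknown patterns. The names `kmeans`, `extract_pure_clusters`, and `UNLABELED`, the parameters `k`, `min_size`, and `max_depth`, and the synthetic data are all hypothetical; the paper's actual purity criterion and stopping rules may differ.

```python
import numpy as np

UNLABELED = -1  # assumed marker for samples without a label


def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    assign = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Distance from every sample to every center, then nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            members = X[assign == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return assign


def extract_pure_clusters(X, y, idx=None, k=2, min_size=4, max_depth=8):
    """Recursively split the samples in idx until clusters are pure.

    Returns (pure, rest): pure is a list of (indices, label) pairs,
    rest is a list of index arrays that could not be made pure.
    """
    if idx is None:
        idx = np.arange(len(X))
    labels = np.unique(y[idx][y[idx] != UNLABELED])
    if len(labels) == 1:
        # Assumed purity rule: every labeled sample agrees on one label.
        return [(idx, int(labels[0]))], []
    if len(labels) == 0 or len(idx) < max(min_size, k) or max_depth == 0:
        # No label information, or too small / too deep to split further.
        return [], [idx]
    assign = kmeans(X[idx], k)
    if len(np.unique(assign)) < 2:
        # k-means could not separate the samples; stop here.
        return [], [idx]
    pure, rest = [], []
    for j in np.unique(assign):
        p, r = extract_pure_clusters(
            X, y, idx[assign == j], k, min_size, max_depth - 1
        )
        pure += p
        rest += r
    return pure, rest


# Synthetic demo: two well-separated blobs with three labeled samples each.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(10.0, 0.5, (20, 2))])
y = np.full(40, UNLABELED)
y[[0, 1, 2]] = 0
y[[20, 21, 22]] = 1

pure, rest = extract_pure_clusters(X, y)
for members, label in pure:
    print(f"pure cluster: label={label}, size={len(members)}")
```

In this sketch the unlabeled members of a pure cluster would inherit the cluster's label, while the index arrays in `rest` correspond loosely to samples that a later stage would need to examine as possible unknown patterns.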