Research Area:  Machine Learning
Few online classification algorithms based on traditional inductive ensembling, such as online bagging or boosting, focus on handling concept-drifting data streams while also performing well on noisy data. Motivated by this, this paper proposes EDTC, an incremental algorithm based on Ensemble Decision Trees for Concept-drifting data streams. Three variants of random feature selection are introduced to implement split-tests, and two thresholds specified in the Hoeffding bound inequality are used to distinguish concept drifts from noisy data. Extensive studies on synthetic and real streaming databases show that EDTC performs very well compared to several known online algorithms based on single models and ensemble models. The authors conclude that the approach provides multiple solutions for learning from concept-drifting data streams under noise.
Keywords:  
Author(s) Name:  Peipei Li, Xindong Wu, Xuegang Hu, Hao Wang
Journal name:  Neurocomputing
Conference name:  
Publisher name:  ELSEVIER
DOI:  10.1016/j.neucom.2015.04.024
Volume Information:  Volume 166, 20 October 2015, Pages 68-83
Paper Link:   https://www.sciencedirect.com/science/article/abs/pii/S0925231215004713