Research Area:  Machine Learning
Data reduction processes are designed not only to reduce the amount of data but also to reduce noise interference. This study focuses on sample reduction algorithms for classification and regression data. A sample quality evaluation measure, denoted NN-kNN and inspired by human social behavior, is proposed. It is a local evaluation method that can accurately evaluate the quality of samples under uneven and irregular data distributions. The measure is also easy to understand and applies to both supervised and unsupervised data. Based on the NN-kNN measure, sample reduction algorithms are then studied separately for classification data and for regression data. Experiments are carried out to verify the proposed quality evaluation measure and the data reduction algorithms. The results show that NN-kNN evaluates data quality effectively and that the high-quality samples selected by the reduction algorithms yield high classification and prediction performance. The robustness of the sample reduction algorithms is also validated.
Keywords:  
Data Reduction
NN-kNN
Classification and Regression
Machine Learning
Deep Learning
Author(s) Name:  Shuang An, Qinghua Hu, Changzhong Wang, Ge Guo & Piyu Li
Journal name:  International Journal of Machine Learning and Cybernetics
Conference name:  
Publisher name:  Springer
DOI:  10.1007/s13042-021-01327-3
Volume Information:  2021
Paper Link:   https://link.springer.com/article/10.1007%2Fs13042-021-01327-3