Research Area:  Machine Learning
Supervised learning is an important branch of machine learning (ML), which requires a complete annotation (labeling) of the involved training data. This assumption is relaxed in the settings of weakly supervised learning, where labels are allowed to be imprecise or partial. In this article, we study the setting of superset learning, in which instances are assumed to be labeled with a set of possible annotations containing the correct one. We tackle the problem of learning from such data in the context of rough set theory (RST). More specifically, we consider the problem of RST-based feature reduction as a suitable means for data disambiguation, i.e., for the purpose of figuring out the most plausible precise instantiation of the imprecise training data. To this end, we define appropriate generalizations of decision tables and reducts, using tools from generalized information theory and belief function theory. Moreover, we analyze the computational complexity and theoretical properties of the associated computational problems. Finally, we present results of a series of experiments, in which we analyze the proposed concepts empirically and compare our methods with a state-of-the-art dimensionality reduction algorithm, reporting a statistically significant improvement in predictive accuracy.
Author(s) Name:  Andrea Campagner, Davide Ciucci, Eyke Hüllermeier
Journal name:  International Journal of Approximate Reasoning
Publisher name:  Elsevier
Volume Information:  Volume 136, September 2021, Pages 150-167
Paper Link:   https://www.sciencedirect.com/science/article/abs/pii/S0888613X21000864
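
Illustrative sketch:  The superset-learning setting described in the abstract can be made concrete with a small toy example. The code below is a simplified illustration under stated assumptions, not the paper's definitions or algorithms: the data in `TABLE`, the helper names `is_consistent` and `minimal_consistent_subsets`, and the consistency criterion (instances that agree on the chosen features must share at least one candidate label) are all hypothetical choices made for this sketch.

```python
from itertools import combinations

# Hypothetical toy data: each row is (feature values, set of candidate labels).
# The true label of each instance is assumed to lie in its candidate set.
TABLE = [
    ((0, 1, 0), {"a", "b"}),
    ((0, 1, 1), {"a"}),
    ((1, 0, 0), {"b", "c"}),
    ((1, 0, 1), {"c"}),
    ((1, 1, 0), {"a", "c"}),
]


def is_consistent(table, features):
    """Return True if instances that agree on `features` share a candidate label.

    Assumed criterion for this sketch: within each group of instances that are
    indiscernible on the selected features, the intersection of the candidate
    label sets must be non-empty.
    """
    common = {}
    for values, labels in table:
        key = tuple(values[i] for i in features)
        common[key] = common[key] & labels if key in common else set(labels)
    return all(common.values())


def minimal_consistent_subsets(table, n_features):
    """Brute-force search for the smallest consistent feature subsets.

    Enumerates subsets by increasing size, so the cost is exponential in the
    number of features; this is only meant for tiny examples.
    """
    for size in range(n_features + 1):
        found = [subset for subset in combinations(range(n_features), size)
                 if is_consistent(table, subset)]
        if found:
            return found
    return []


if __name__ == "__main__":
    print(minimal_consistent_subsets(TABLE, 3))
```

In this toy table, keeping only feature 0 (or only feature 1) already groups the instances so that each group retains a shared candidate label, which in turn suggests a precise label for every ambiguous instance; feature 2 alone does not.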