A multiclass classification approach for incremental entity resolution on short textual data - 2021

Research Area:  Machine Learning


Several web applications maintain data repositories that contain references to thousands of real-world entities originating from multiple sources, and they continually receive new data. Identifying the distinct entities and associating the correct references with each one is a problem known as entity resolution. The challenge is to solve the problem incrementally, as the data arrive, especially when those data are described by a single textual attribute. In this paper, we propose a new approach for incremental entity resolution. The method we have implemented, called AssocIER, uses an ensemble of multiclass classifiers with self-training and detection of novel classes. We have evaluated our method on various real-world datasets and scenarios, comparing it with a traditional entity resolution approach. The results show that AssocIER is effective and efficient at resolving unstructured data in collections with a large number of entities and features, and that it is able to detect hundreds of novel classes.
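The abstract does not describe AssocIER's internals, so the following is only a toy sketch of the general idea it names: classifying each incoming short text into an entity class, opening a new class when no existing one fits, and folding each assigned text back into its class (a simple form of self-training). The class name `IncrementalResolver`, the two similarity "views", and the 0.3 threshold are all illustrative assumptions, not the authors' method.

```python
from collections import defaultdict


def word_tokens(text):
    """Lowercased whitespace tokens of a short text."""
    return set(text.lower().split())


def char_trigrams(text):
    """Character trigrams of the lowercased text with spaces removed."""
    s = "".join(text.lower().split())
    return {s[i:i + 3] for i in range(len(s) - 2)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class IncrementalResolver:
    """Toy resolver (hypothetical): each entity is a class with set profiles."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed novelty cut-off, not from the paper
        self.featurizers = [word_tokens, char_trigrams]  # two simple views
        self.profiles = defaultdict(lambda: [set() for _ in self.featurizers])
        self.next_id = 0

    def _score(self, text, entity_id):
        # Average the similarity reported by each view (a minimal "ensemble").
        views = self.profiles[entity_id]
        sims = [jaccard(f(text), v) for f, v in zip(self.featurizers, views)]
        return sum(sims) / len(sims)

    def resolve(self, text):
        """Assign text to an existing entity or open a new one; return its id."""
        best_id, best = None, 0.0
        for entity_id in list(self.profiles):
            score = self._score(text, entity_id)
            if score > best:
                best_id, best = entity_id, score
        if best_id is None or best < self.threshold:
            best_id = self.next_id  # no class fits well enough: novel class
            self.next_id += 1
        # Self-training step: fold the new reference into the class profile.
        for f, view in zip(self.featurizers, self.profiles[best_id]):
            view |= f(text)
        return best_id


resolver = IncrementalResolver()
first = resolver.resolve("Apple iPhone 12 64GB black")
second = resolver.resolve("iphone 12 apple 64gb")
third = resolver.resolve("Samsung Galaxy S21 128GB")
```

In this sketch the first two product descriptions end up in the same class, while the third shares no tokens or trigrams with it and therefore starts a new class. A real system would need a far more robust classifier and a principled way to choose the threshold.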


Author(s) Name:  João Antonio Silva and Denilson Alves Pereira

Journal name:  International Journal of Business Intelligence and Data Mining

Conference name:  

Publisher name:  Inderscience

DOI:  10.1504/IJBIDM.2021.112988

Volume Information:  Vol. 18, No. 2, pp. 218-245