TripleRank: An unsupervised keyphrase extraction algorithm

Research Paper on unsupervised keyphrase extraction algorithm

Research Area: Machine Learning

Abstract:

Automatic keyphrase extraction algorithms aim to identify the words and phrases that carry the core information in a document. As online scholarly resources have become widespread in recent years, better keyphrase extraction techniques are required to improve search efficiency. We present two features, keyphrase semantic diversity and keyphrase coverage, to overcome limitations of existing methods for unsupervised keyphrase extraction. Keyphrase semantic diversity is the degree of semantic variety in the extraction result; it is introduced to avoid extracting synonymous phrases that contain the same high-scoring candidate. Keyphrase coverage refers to how well a candidate represents the other words in a document. We propose an unsupervised keyphrase extraction method called TripleRank, which evaluates three features: word position (a sensitive feature for academic documents) and the two new features mentioned above. The architecture of TripleRank comprises three sub-models, one scoring each feature, and a summing model. Although it involves multiple models, TripleRank has no typical iteration process; hence, its computational cost is relatively low. TripleRank outperformed four state-of-the-art baseline models on four academic datasets, which confirms the influence of keyphrase semantic diversity and keyphrase coverage and demonstrates the efficiency of our method.
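The three-feature architecture described in the abstract can be illustrated with a toy sketch. The abstract does not give the paper's scoring formulas, so everything below is an assumption made for illustration: the function names, the reciprocal position score, the sentence co-occurrence proxy for coverage, the Jaccard-based diversity penalty, and the unweighted sum are all hypothetical stand-ins, not TripleRank's actual sub-models.

```python
def find_first(phrase, tokens):
    """Return the index of the first occurrence of `phrase` in `tokens`, or None."""
    words = phrase.split()
    n = len(words)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == words:
            return i
    return None


def position_score(phrase, tokens):
    """Assumed form: earlier first occurrence gives a higher score, 1 / (index + 1)."""
    i = find_first(phrase, tokens)
    return 0.0 if i is None else 1.0 / (i + 1)


def coverage_score(phrase, sentences):
    """Assumed proxy: share of the other vocabulary words that appear in a
    sentence together with at least one word of the candidate."""
    words = set(phrase.split())
    vocab = {w for s in sentences for w in s}
    covered = set()
    for s in sentences:
        if words & set(s):
            covered |= set(s)
    others = vocab - words
    return len(covered - words) / len(others) if others else 0.0


def diversity_score(phrase, selected):
    """Assumed proxy: 1 minus the largest Jaccard word overlap with any
    phrase that has already been selected."""
    words = set(phrase.split())
    overlaps = [
        len(words & set(p.split())) / len(words | set(p.split()))
        for p in selected
    ]
    return 1.0 - max(overlaps, default=0.0)


def rank(candidates, sentences, top_k=3):
    """Greedy selection by the unweighted sum of the three toy scores."""
    tokens = [w for s in sentences for w in s]
    # Position and coverage do not depend on earlier picks, so compute them once.
    base = {
        c: position_score(c, tokens) + coverage_score(c, sentences)
        for c in candidates
    }
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < top_k:
        best = max(remaining, key=lambda c: base[c] + diversity_score(c, selected))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    sentences = [
        ["keyphrase", "extraction", "finds", "core", "phrases"],
        ["keyphrase", "extraction", "methods", "rank", "candidates"],
        ["graph", "models", "rank", "words"],
    ]
    candidates = ["keyphrase extraction", "keyphrase extraction methods", "graph models"]
    print(rank(candidates, sentences, top_k=2))
```

In this toy input, "keyphrase extraction methods" shares most of its words with the top-scoring "keyphrase extraction", so the diversity term ranks it below "graph models". That mirrors the abstract's stated motivation of avoiding near-synonymous phrases built around the same high-scoring candidate, though the real method may achieve this quite differently.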

Keywords:
Automatic keyphrase extraction
TripleRank
unsupervised
Machine Learning
Deep Learning

Author(s) Name: Tuohang Li, Liang Hu, Hongtu Li, Chengyu Sun, Shuai Li, Ling Chi

Journal name: Knowledge-Based Systems

Conference name:

Publisher name: Elsevier

DOI: 10.1016/j.knosys.2021.106846

Volume Information: Volume 219, 11 May 2021, 106846

Paper Link: https://www.sciencedirect.com/science/article/abs/pii/S095070512100109X
