
Automatic keyphrase extraction using word embeddings - 2020

Research Area:  Machine Learning

Abstract:

Unsupervised random-walk keyphrase extraction models mainly rely on the global structural information of the word graph, with nodes representing candidate words and edges capturing the co-occurrence information between candidate words. However, using word embedding methods to integrate multiple kinds of useful information into the random-walk model to better extract keyphrases remains relatively unexplored. In this paper, we propose a random-walk-based ranking method to extract keyphrases from text documents using word embeddings. Specifically, we first design a heterogeneous text graph embedding model to integrate local context information of the word graph (i.e., the local word collocation patterns) with crucial features of candidate words and edges of the word graph. Then, a novel random-walk-based ranking model is designed to score candidate words by leveraging the learned word embeddings. Finally, a new and generic similarity-based phrase scoring model using word embeddings is proposed to score phrases, selecting the top-scoring phrases as keyphrases. Experimental results show that the proposed method consistently outperforms eight state-of-the-art unsupervised methods on three real datasets for keyphrase extraction.
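The pipeline described in the abstract (build a word co-occurrence graph, bias a random walk with embedding information, then score candidate phrases from the word scores) can be sketched as follows. This is a minimal illustration, not the paper's actual model: the toy embeddings, the edge-weighting rule (co-occurrence count times cosine similarity), and the sum-of-word-scores phrase scorer are all simplifying assumptions standing in for the learned heterogeneous graph embeddings and the similarity-based phrase scoring model.

```python
import math
from collections import defaultdict

# Hypothetical toy word embeddings; the paper learns these from a
# heterogeneous text graph, here they are hand-made for illustration.
EMB = {
    "neural": [0.9, 0.1], "network": [0.8, 0.2],
    "training": [0.7, 0.3], "data": [0.2, 0.8],
}

def cos(u, v):
    """Cosine similarity between two embedding vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def embedding_weighted_pagerank(cooccur_edges, d=0.85, iters=50):
    """Random-walk word scores on an undirected co-occurrence graph,
    with transition weights combining co-occurrence counts and
    embedding similarity (an assumed, simplified weighting scheme)."""
    weights = defaultdict(dict)
    nodes = set()
    for (u, v), count in cooccur_edges.items():
        w = count * max(cos(EMB[u], EMB[v]), 0.0)
        weights[u][v] = w
        weights[v][u] = w
        nodes.update((u, v))
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {}
        for n in nodes:
            rank = 0.0
            for m, ws in weights.items():
                if n in ws:
                    total = sum(ws.values())
                    if total:
                        rank += score[m] * ws[n] / total
            new[n] = (1 - d) / len(nodes) + d * rank
        score = new
    return score

def score_phrase(phrase_words, word_scores):
    """Sum of member word scores: a simple stand-in for the paper's
    similarity-based phrase scoring model."""
    return sum(word_scores.get(w, 0.0) for w in phrase_words)

# Usage: rank candidate words from toy co-occurrence counts.
edges = {("neural", "network"): 3, ("network", "training"): 2,
         ("training", "data"): 2}
scores = embedding_weighted_pagerank(edges)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In the actual method, the edge and node features are folded into the learned embeddings themselves rather than multiplied in at walk time, but the overall flow (graph construction, biased walk, phrase aggregation) is the same.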

Keywords:  Keyphrase extraction, Random-walk-based keyphrase extraction model, Word embedding, Phrase scoring model, Machine Learning, Deep Learning

Author(s) Name:  Yuxiang Zhang, Huan Liu, Suge Wang, W. H. Ip., Wei Fan & Chunjing Xiao

Journal name:  Soft Computing

Conference name:  

Publisher name:  Springer

DOI:  10.1007/s00500-019-03963-y

Volume Information:  Volume 24, pages 5593–5608 (2020)