A Domain Independent Double Layered Approach to Keyphrase Generation - 2014

Research Area:  Machine Learning

Abstract:

The annotation of documents and web pages with semantic metadata is an activity that can greatly increase the accuracy of Information Retrieval and Personalization systems, but the growing amount of text data available is too large for an extensive manual process. On the other hand, automatic keyphrase generation, a complex task involving Natural Language Processing and Knowledge Engineering, can significantly support this activity. Several different strategies have been proposed over the years, but most of them require extensive training data, which are not always available, suffer from high ambiguity and differences in writing style, are highly domain-specific, and often rely on well-structured knowledge that is very hard to acquire and encode. In order to overcome these limitations, we propose in this paper an innovative domain-independent approach that consists of an unsupervised keyphrase extraction phase and a subsequent keyphrase inference phase based on loosely structured, collaborative knowledge such as Wikipedia, Wordnik, and Urban Dictionary. This double-layered approach allows us to generate keyphrases that both describe and classify the text.

Author(s) Name:  Dario De Nart and Carlo Tasso

Conference name:  Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 1: WEBIST

Publisher name:  SCITEPRESS

DOI:  10.5220/0004855303050312
