Large-Scale Evaluation of Keyphrase Extraction Models

Research Area: Machine Learning

Abstract:

Keyphrase extraction models are usually evaluated under different, not directly comparable, experimental setups. As a result, it remains unclear how well proposed models actually perform, and how they compare to each other. In this work, we address this issue by presenting a systematic large-scale analysis of state-of-the-art keyphrase extraction models involving multiple benchmark datasets from various sources and domains. Our main results reveal that state-of-the-art models are in fact still challenged by simple baselines on some datasets. We also present new insights about the impact of using author- or reader-assigned keyphrases as a proxy for the gold standard, and give recommendations for strong baselines and reliable benchmark datasets.

Keywords:

Author(s) Name: Ygor Gallina, Florian Boudin, Béatrice Daille

Journal name:

Conference name: JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries

Publisher name: ACM

DOI: 10.1145/3383583.3398517

Volume Information:

Paper Link: https://dl.acm.org/doi/abs/10.1145/3383583.3398517
