Research Area:  Machine Learning
Keyphrase extraction models are usually evaluated under different, not directly comparable, experimental setups. As a result, it remains unclear how well proposed models actually perform, and how they compare to each other. In this work, we address this issue by presenting a systematic large-scale analysis of state-of-the-art keyphrase extraction models involving multiple benchmark datasets from various sources and domains. Our main results reveal that state-of-the-art models are in fact still challenged by simple baselines on some datasets. We also present new insights about the impact of using author- or reader-assigned keyphrases as a proxy for gold standard, and give recommendations for strong baselines and reliable benchmark datasets.
Keywords:  
Author(s) Name:  Ygor Gallina, Florian Boudin, Béatrice Daille
Journal name:  
Conference name:  JCDL '20: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
Publisher name:  ACM
DOI:  10.1145/3383583.3398517
Volume Information:  
Paper Link:   https://dl.acm.org/doi/abs/10.1145/3383583.3398517