An Analysis of Datasets, Metrics and Models in Keyphrase Generation

Florian Boudin, Akiko Aizawa


Abstract
Keyphrase generation refers to the task of producing a set of words or phrases that summarises the content of a document. Continuous efforts have been dedicated to this task over the past few years, spreading across multiple lines of research, such as model architectures, data resources, and use-case scenarios. Yet, the current state of keyphrase generation remains unknown as there has been no attempt to review and analyse previous work. In this paper, we bridge this gap by presenting an analysis of over 50 research papers on keyphrase generation, offering a comprehensive overview of recent progress, limitations, and open challenges. Our findings highlight several critical issues in current evaluation practices, such as the concerning similarity among commonly used benchmark datasets and inconsistencies in metric calculations leading to overestimated performance. Additionally, we address the limited availability of pre-trained models by releasing a strong PLM-based model for keyphrase generation as an effort to facilitate future research.
Anthology ID:
2025.gem-1.76
Volume:
Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²)
Month:
July
Year:
2025
Address:
Vienna, Austria and virtual meeting
Editors:
Kaustubh Dhole, Miruna Clinciu
Venues:
GEM | WS
Publisher:
Association for Computational Linguistics
Pages:
973
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.gem-1.76/
Cite (ACL):
Florian Boudin and Akiko Aizawa. 2025. An Analysis of Datasets, Metrics and Models in Keyphrase Generation. In Proceedings of the Fourth Workshop on Generation, Evaluation and Metrics (GEM²), pages 973–973, Vienna, Austria and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
An Analysis of Datasets, Metrics and Models in Keyphrase Generation (Boudin & Aizawa, GEM 2025)
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.gem-1.76.pdf