One2Set: Generating Diverse Keyphrases as a Set

Jiacheng Ye, Tao Gui, Yichao Luo, Yige Xu, Qi Zhang


Abstract
Recently, sequence-to-sequence models have made remarkable progress on keyphrase generation (KG) by concatenating multiple keyphrases in a predefined order to form a target sequence during training. However, keyphrases are inherently an unordered set rather than an ordered sequence. Imposing a predefined order introduces a wrong bias during training, which can heavily penalize shifts in the order between keyphrases. In this work, we propose a new training paradigm, One2Set, which does not predefine an order for concatenating the keyphrases. To fit this paradigm, we propose a novel model that uses a fixed set of learned control codes as conditions to generate a set of keyphrases in parallel. Because there is no correspondence between each prediction and target during training, we propose a K-step label assignment mechanism via bipartite matching, which greatly increases the diversity and reduces the repetition rate of the generated keyphrases. Experimental results on multiple benchmarks demonstrate that our approach significantly outperforms state-of-the-art methods.
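The idea of assigning targets to parallel predictions via bipartite matching can be illustrated with a minimal sketch. It is not the authors' implementation: the cost (negative sum of the probabilities a prediction slot gives to a target's first K tokens), the `EMPTY` placeholder for "no keyphrase", the function names `match_cost` and `assign_targets`, and the toy probabilities are all assumptions made for illustration. A brute-force search over permutations stands in for an efficient solver such as the Hungarian algorithm, so it is only suitable for tiny examples.

```python
from itertools import permutations

EMPTY = None  # hypothetical placeholder target meaning "no keyphrase"


def match_cost(step_probs, target, k):
    """Cost of pairing one prediction slot with one target (assumed form).

    step_probs[t] maps tokens to the probability the slot assigns at step t.
    Only the first k target tokens are scored; EMPTY targets cost nothing.
    """
    if target is EMPTY:
        return 0.0
    return -sum(step_probs[t].get(tok, 0.0) for t, tok in enumerate(target[:k]))


def assign_targets(slot_probs, targets, k=2):
    """Return, for each slot, the target it is matched with (or EMPTY).

    Assumes len(targets) <= len(slot_probs); targets are padded with EMPTY
    so that the matching is one-to-one.
    """
    n = len(slot_probs)
    padded = list(targets) + [EMPTY] * (n - len(targets))
    best_cost, best_perm = float("inf"), None
    # Exhaustive search over one-to-one assignments (illustration only).
    for perm in permutations(range(n)):
        cost = sum(match_cost(slot_probs[i], padded[perm[i]], k) for i in range(n))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return [padded[j] for j in best_perm]


# Toy example: three prediction slots, two ground-truth keyphrases.
slot_probs = [
    [{"neural": 0.7, "graph": 0.2}, {"network": 0.8}],
    [{"graph": 0.6, "neural": 0.3}, {"matching": 0.5}],
    [{"neural": 0.1, "graph": 0.1}, {"network": 0.1}],
]
targets = [("graph", "matching"), ("neural", "network")]
assignment = assign_targets(slot_probs, targets, k=2)
```

In this toy setup, the first slot is paired with ("neural", "network"), the second with ("graph", "matching"), and the third with the empty target, regardless of the order in which the targets are listed.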
Anthology ID:
2021.acl-long.354
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
4598–4608
URL:
https://aclanthology.org/2021.acl-long.354
DOI:
10.18653/v1/2021.acl-long.354
Cite (ACL):
Jiacheng Ye, Tao Gui, Yichao Luo, Yige Xu, and Qi Zhang. 2021. One2Set: Generating Diverse Keyphrases as a Set. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4598–4608, Online. Association for Computational Linguistics.
Cite (Informal):
One2Set: Generating Diverse Keyphrases as a Set (Ye et al., ACL-IJCNLP 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.acl-long.354.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2021.acl-long.354.mp4
Code:
jiacheng-ye/kg_one2set
Data:
KP20k