WR-One2Set: Towards Well-Calibrated Keyphrase Generation

Binbin Xie, Xiangpeng Wei, Baosong Yang, Huan Lin, Jun Xie, Xiaoli Wang, Min Zhang, Jinsong Su


Abstract
Keyphrase generation aims to automatically generate short phrases that summarize an input document. The recently proposed ONE2SET paradigm (Ye et al., 2021) generates keyphrases as a set and has achieved competitive performance. Nevertheless, we observe that ONE2SET produces serious calibration errors, in particular over-estimating the ∅ token (meaning “no corresponding keyphrase”). In this paper, we analyze this limitation in depth and identify two main causes: 1) parallel generation must introduce excessive ∅ tokens as padding in training instances; and 2) the training mechanism that assigns a target to each slot is unstable and further aggravates the over-estimation of the ∅ token. To make the model well-calibrated, we propose WR-ONE2SET, which extends ONE2SET with an adaptive instance-level cost Weighting strategy and a target Re-assignment mechanism. The former dynamically penalizes the over-estimated slots for each instance, thus smoothing the uneven training distribution. The latter refines inappropriate original assignments and reduces the supervisory signals of over-estimated slots. Experimental results on commonly used datasets demonstrate the effectiveness and generality of our proposed paradigm.
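As a rough illustration of the instance-level cost weighting idea described in the abstract, the sketch below down-weights the loss of slots whose target is ∅. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the rule (ratio of real targets to ∅ targets, capped at 1), the function names `null_weight` and `weighted_slot_loss`, and the simplification that each slot carries a single target probability rather than a token sequence.

```python
import math

NULL = "∅"  # placeholder target meaning "no corresponding keyphrase"


def null_weight(targets, null=NULL):
    """Per-instance loss weight for ∅ slots.

    Hypothetical rule (not taken from the paper): the ratio of real
    targets to ∅ targets, capped at 1, so instances padded with many
    ∅ slots contribute less ∅ supervision.
    """
    n_null = sum(1 for t in targets if t == null)
    n_real = len(targets) - n_null
    if n_null == 0 or n_real == 0:
        return 1.0  # nothing to rebalance; fall back to uniform weights
    return min(1.0, n_real / n_null)


def weighted_slot_loss(target_probs, targets, null=NULL):
    """Weighted negative log-likelihood over the slots of one instance.

    target_probs[i] is the model probability of targets[i] at slot i
    (a one-probability-per-slot simplification of set generation).
    """
    w_null = null_weight(targets, null)
    weights = [w_null if t == null else 1.0 for t in targets]
    total = sum(-w * math.log(p) for w, p in zip(weights, target_probs))
    return total / sum(weights)


# Example: 8 slots, 2 real keyphrases, 6 slots padded with ∅.
targets = ["neural networks", "keyphrase generation"] + [NULL] * 6
probs = [0.5] * 8
w = null_weight(targets)
loss = weighted_slot_loss(probs, targets)
```

In this toy instance the six ∅ slots together carry the same total weight as the two real keyphrases, which is one simple way to keep the padding token from dominating the training signal.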
Anthology ID:
2022.emnlp-main.491
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Editors:
Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
7283–7293
URL:
https://aclanthology.org/2022.emnlp-main.491
DOI:
10.18653/v1/2022.emnlp-main.491
Cite (ACL):
Binbin Xie, Xiangpeng Wei, Baosong Yang, Huan Lin, Jun Xie, Xiaoli Wang, Min Zhang, and Jinsong Su. 2022. WR-One2Set: Towards Well-Calibrated Keyphrase Generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7283–7293, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
WR-One2Set: Towards Well-Calibrated Keyphrase Generation (Xie et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/add_acl24_videos/2022.emnlp-main.491.pdf