Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation
Dayiheng Liu, Yeyun Gong, Yu Yan, Jie Fu, Bo Shao, Daxin Jiang, Jiancheng Lv, Nan Duan
Abstract
News headline generation aims to produce a short sentence to attract readers to read the news. One news article often contains multiple keyphrases that are of interest to different users, which can naturally have multiple reasonable headlines. However, most existing methods focus on the single headline generation. In this paper, we propose generating multiple headlines with keyphrases of user interests, whose main idea is to generate multiple keyphrases of interest to users for the news first, and then generate multiple keyphrase-relevant headlines. We propose a multi-source Transformer decoder, which takes three sources as inputs: (a) keyphrase, (b) keyphrase-filtered article, and (c) original article to generate keyphrase-relevant, high-quality, and diverse headlines. Furthermore, we propose a simple and effective method to mine the keyphrases of interest in the news article and build a first large-scale keyphrase-aware news headline corpus, which contains over 180K aligned triples of <news article, headline, keyphrase>. Extensive experimental comparisons on the real-world dataset show that the proposed method achieves state-of-the-art results in terms of quality and diversity.- Anthology ID:
- 2020.emnlp-main.505
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 6241–6250
- Language:
- URL:
- https://aclanthology.org/2020.emnlp-main.505
- DOI:
- 10.18653/v1/2020.emnlp-main.505
- Cite (ACL):
- Dayiheng Liu, Yeyun Gong, Yu Yan, Jie Fu, Bo Shao, Daxin Jiang, Jiancheng Lv, and Nan Duan. 2020. Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6241–6250, Online. Association for Computational Linguistics.
- Cite (Informal):
- Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation (Liu et al., EMNLP 2020)
- PDF:
- https://preview.aclanthology.org/alta-23-ingestion/2020.emnlp-main.505.pdf
- Code
- dayihengliu/KeyMultiHeadline