Abstract
How can we effectively inform content selection in Transformer-based abstractive summarization models? In this work, we present a simple yet effective attention head masking technique, applied to encoder-decoder attention to pinpoint salient content at inference time. Using attention head masking, we reveal the relation between encoder-decoder attention and the content selection behavior of summarization models. We then demonstrate its effectiveness on three document summarization datasets in both in-domain and cross-domain settings. Importantly, our models outperform prior state-of-the-art models on the CNN/DailyMail and New York Times datasets. Moreover, our inference-time masking technique is data-efficient, requiring only 20% of the training samples to outperform BART fine-tuned on the full CNN/DailyMail dataset. (A minimal code sketch of the masking idea appears after the metadata below.)
- Anthology ID:
- 2021.naacl-main.397
- Volume:
- Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Editors:
- Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5008–5016
- URL:
- https://aclanthology.org/2021.naacl-main.397
- DOI:
- 10.18653/v1/2021.naacl-main.397
- Cite (ACL):
- Shuyang Cao and Lu Wang. 2021. Attention Head Masking for Inference Time Content Selection in Abstractive Summarization. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5008–5016, Online. Association for Computational Linguistics.
- Cite (Informal):
- Attention Head Masking for Inference Time Content Selection in Abstractive Summarization (Cao & Wang, NAACL 2021)
- PDF:
- https://aclanthology.org/2021.naacl-main.397.pdf
- Data:
- New York Times Annotated Corpus
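To make the abstract's core idea concrete, here is a minimal, self-contained sketch of inference-time head masking on encoder-decoder (cross) attention: for selected heads, attention over source tokens not flagged as salient is blocked before the softmax. All names below are illustrative; in the paper the masking is applied inside a fine-tuned Transformer such as BART, and the choice of salient tokens and masked heads comes from the paper's analysis, not from this placeholder code.

```python
# Hypothetical sketch of attention head masking at inference time.
# Not the authors' implementation; shapes and names are assumptions.
import torch
import torch.nn.functional as F

def masked_cross_attention(q, k, v, salient, masked_heads):
    """Scaled dot-product cross-attention where the heads listed in
    `masked_heads` may only attend to source tokens flagged as salient.

    q: (batch, heads, tgt_len, d)    decoder queries
    k, v: (batch, heads, src_len, d) encoder keys / values
    salient: (batch, src_len) bool   True for source tokens kept visible
    masked_heads: iterable of head indices to restrict
    """
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # (batch, heads, tgt, src)

    # Block non-salient source positions with -inf, but only for the
    # selected heads; unmasked heads attend to the full source as usual.
    neg_inf = torch.finfo(scores.dtype).min
    block = (~salient)[:, None, :]                # (batch, 1, src)
    for h in masked_heads:
        scores[:, h] = scores[:, h].masked_fill(block, neg_inf)

    attn = F.softmax(scores, dim=-1)              # renormalize over source
    return attn @ v                               # (batch, heads, tgt, d)

# Toy usage: 2 heads; head 0 restricted to the first two source tokens.
q = torch.randn(1, 2, 3, 8)
k = torch.randn(1, 2, 5, 8)
v = torch.randn(1, 2, 5, 8)
salient = torch.tensor([[True, True, False, False, False]])
out = masked_cross_attention(q, k, v, salient, masked_heads=[0])
print(out.shape)  # torch.Size([1, 2, 3, 8])
```

Because the mask is applied before the softmax, the restricted heads redistribute their full attention mass over the salient tokens rather than merely down-weighting the rest, which is what lets masking steer content selection without any retraining.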