Abstract
Compressive summarization systems typically rely on a seed set of syntactic rules to determine under what circumstances deleting a span is permissible, then learn which compressions to actually apply by optimizing for ROUGE. In this work, we propose to relax these explicit syntactic constraints on candidate spans, and instead leave the decision about what to delete to two data-driven criteria: plausibility and salience. Deleting a span is plausible if removing it maintains the grammaticality and factuality of a sentence, and it is salient if it removes important information from the summary. Each of these is judged by a pre-trained Transformer model, and only deletions that are both plausible and not salient can be applied. When integrated into a simple extraction-compression pipeline, our method achieves strong in-domain results on benchmark datasets, and human evaluation shows that the plausibility model generally selects for grammatical and factual deletions. Furthermore, the flexibility of our approach allows it to generalize cross-domain, and we show that our system fine-tuned on only 500 samples from a new domain can match or exceed a strong in-domain extractive model.- Anthology ID:
- 2020.emnlp-main.507
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 6259–6274
- Language:
- URL:
- https://aclanthology.org/2020.emnlp-main.507
- DOI:
- 10.18653/v1/2020.emnlp-main.507
- Cite (ACL):
- Shrey Desai, Jiacheng Xu, and Greg Durrett. 2020. Compressive Summarization with Plausibility and Salience Modeling. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 6259–6274, Online. Association for Computational Linguistics.
- Cite (Informal):
- Compressive Summarization with Plausibility and Salience Modeling (Desai et al., EMNLP 2020)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2020.emnlp-main.507.pdf
- Code
- shreydesai/cups
- Data
- New York Times Annotated Corpus, Sentence Compression, WikiHow