Abstract
This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. It introduces a novel method that encourages the inclusion of key terms from the original document into the summary: key terms are masked out of the original document and must be filled in by a coverage model using the current generated summary. A novel unsupervised training procedure leverages this coverage model along with a fluency model to generate and score summaries. When tested on popular news summarization datasets, the method outperforms previous unsupervised methods by more than 2 R-1 points, and approaches results of competitive supervised methods. Our model attains higher levels of abstraction with copied passages roughly two times shorter than prior work, and learns to compress and merge sentences without supervision.
- Anthology ID:
- 2020.acl-main.460
- Volume:
- Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5135–5150
- URL:
- https://aclanthology.org/2020.acl-main.460
- DOI:
- 10.18653/v1/2020.acl-main.460
- Cite (ACL):
- Philippe Laban, Andrew Hsi, John Canny, and Marti A. Hearst. 2020. The Summary Loop: Learning to Write Abstractive Summaries Without Examples. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5135–5150, Online. Association for Computational Linguistics.
- Cite (Informal):
- The Summary Loop: Learning to Write Abstractive Summaries Without Examples (Laban et al., ACL 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2020.acl-main.460.pdf
- Code:
- cannylab/summary_loop
- Data:
- CNN/Daily Mail, NEWSROOM
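
The coverage idea described in the abstract (mask key terms in the document, then check whether they can be filled back in from the summary) can be illustrated with a toy sketch. This is not the authors' implementation: the paper uses a learned coverage model, while the sketch below swaps in a simple word-overlap proxy. The names `tokenize`, `mask_document`, and `toy_coverage`, and the hand-picked key terms, are all assumptions made for illustration.

```python
import re

MASK = "[MASK]"

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def mask_document(document, key_terms):
    """Replace every occurrence of a key term in the document with MASK."""
    terms = {t.lower() for t in key_terms}
    return [MASK if tok in terms else tok for tok in tokenize(document)]

def toy_coverage(document, summary, key_terms):
    """Fraction of masked slots whose original word appears in the summary.

    Assumption: a learned fill-in model is replaced here by a plain
    vocabulary lookup, so this only rewards exact word reuse.
    """
    terms = {t.lower() for t in key_terms}
    masked_words = [tok for tok in tokenize(document) if tok in terms]
    if not masked_words:
        return 0.0
    summary_vocab = set(tokenize(summary))
    recovered = sum(1 for word in masked_words if word in summary_vocab)
    return recovered / len(masked_words)

if __name__ == "__main__":
    doc = "The storm hit the coast. The storm closed schools along the coast."
    terms = ["storm", "coast", "schools"]
    print(mask_document(doc, terms))
    print(toy_coverage(doc, "A storm closed schools.", terms))
```

In this proxy, a summary that mentions more of the masked key terms scores higher, which mirrors the incentive the abstract describes. A real system would pair such a score with a fluency score and a length constraint, neither of which is modeled here.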