StructSum: Summarization via Structured Representations
Vidhisha Balachandran, Artidoro Pagnoni, Jay Yoon Lee, Dheeraj Rajagopal, Jaime Carbonell, Yulia Tsvetkov
Abstract
Abstractive text summarization aims at compressing the information of a long source document into a rephrased, condensed summary. Despite advances in modeling techniques, abstractive summarization models still suffer from several key challenges: (i) layout bias: they overfit to the style of training corpora; (ii) limited abstractiveness: they are optimized to copying n-grams from the source rather than generating novel abstractive summaries; (iii) lack of transparency: they are not interpretable. In this work, we propose a framework based on document-level structure induction for summarization to address these challenges. To this end, we propose incorporating latent and explicit dependencies across sentences in the source document into end-to-end single-document summarization models. Our framework complements standard encoder-decoder summarization models by augmenting them with rich structure-aware document representations based on implicitly learned (latent) structures and externally-derived linguistic (explicit) structures. We show that our summarization framework, trained on the CNN/DM dataset, improves the coverage of content in the source documents, generates more abstractive summaries by generating more novel n-grams, and incorporates interpretable sentence-level structures, while performing on par with standard baselines.- Anthology ID:
- 2021.eacl-main.220
- Volume:
- Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
- Month:
- April
- Year:
- 2021
- Address:
- Online
- Editors:
- Paola Merlo, Jorg Tiedemann, Reut Tsarfaty
- Venue:
- EACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 2575–2585
- Language:
- URL:
- https://aclanthology.org/2021.eacl-main.220
- DOI:
- 10.18653/v1/2021.eacl-main.220
- Cite (ACL):
- Vidhisha Balachandran, Artidoro Pagnoni, Jay Yoon Lee, Dheeraj Rajagopal, Jaime Carbonell, and Yulia Tsvetkov. 2021. StructSum: Summarization via Structured Representations. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 2575–2585, Online. Association for Computational Linguistics.
- Cite (Informal):
- StructSum: Summarization via Structured Representations (Balachandran et al., EACL 2021)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2021.eacl-main.220.pdf
- Code
- vidhishanair/structured_summarizer
- Data
- CNN/Daily Mail