Enhancing Multi-Document Summarization with Cross-Document Graph-based Information Extraction
Zixuan Zhang, Heba Elfardy, Markus Dreyer, Kevin Small, Heng Ji, Mohit Bansal
Abstract
Information extraction (IE) and summarization are closely related, both tasked with presenting a subset of the information contained in a natural language text. However, while IE extracts structural representations, summarization aims to abstract the most salient information into a generated text summary, thus potentially encountering the technical limitations of current text generation methods (e.g., hallucination). To mitigate this risk, this work uses structured IE graphs to enhance the abstractive summarization task. Specifically, we focus on improving Multi-Document Summarization (MDS) performance by using cross-document IE output, incorporating two novel components: (1) auxiliary entity and event recognition systems that focus the summary generation model; and (2) an alignment loss between IE nodes and their text spans that reduces inconsistencies between the IE graphs and text representations. Operationally, both the IE nodes and their corresponding text spans are projected into the same embedding space, and the pairwise distance between them is minimized. Experimental results on multiple MDS benchmarks show that summaries generated by our model are more factually consistent with the source documents than those of baseline models while maintaining the same level of abstractiveness.
- Anthology ID:
- 2023.eacl-main.124
- Volume:
- Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
- Month:
- May
- Year:
- 2023
- Address:
- Dubrovnik, Croatia
- Editors:
- Andreas Vlachos, Isabelle Augenstein
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1696–1707
- URL:
- https://preview.aclanthology.org/add_missing_videos/2023.eacl-main.124/
- DOI:
- 10.18653/v1/2023.eacl-main.124
- Cite (ACL):
- Zixuan Zhang, Heba Elfardy, Markus Dreyer, Kevin Small, Heng Ji, and Mohit Bansal. 2023. Enhancing Multi-Document Summarization with Cross-Document Graph-based Information Extraction. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 1696–1707, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal):
- Enhancing Multi-Document Summarization with Cross-Document Graph-based Information Extraction (Zhang et al., EACL 2023)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2023.eacl-main.124.pdf
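The alignment loss mentioned in the abstract (IE nodes and their text spans projected into one embedding space, with pairwise distance minimized) can be illustrated with a minimal sketch. The abstract does not state the projection or the distance function, so the linear projections, the squared Euclidean distance, the averaging over pairs, and all names below (`project`, `alignment_loss`, `w_node`, `w_span`) are assumptions made for illustration and are not the authors' implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def project(weights: Matrix, vec: Vector) -> List[float]:
    """Map a vector into the shared space with a linear projection (assumed)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def alignment_loss(
    node_vecs: Sequence[Vector],
    span_vecs: Sequence[Vector],
    w_node: Matrix,
    w_span: Matrix,
) -> float:
    """Mean squared Euclidean distance between aligned node/span pairs.

    node_vecs[i] and span_vecs[i] are assumed to describe the same
    entity or event; the distance choice is an assumption of this sketch.
    """
    if len(node_vecs) != len(span_vecs):
        raise ValueError("each IE node needs exactly one aligned text span")
    if not node_vecs:
        return 0.0

    total = 0.0
    for node, span in zip(node_vecs, span_vecs):
        p = project(w_node, node)  # IE node in the shared space
        q = project(w_span, span)  # text span in the shared space
        total += sum((a - b) ** 2 for a, b in zip(p, q))
    return total / len(node_vecs)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    nodes = [[1.0, 0.0], [0.0, 1.0]]
    spans = [[1.0, 0.0], [0.0, 3.0]]
    # First pair coincides (distance 0); second differs by 2 in one
    # coordinate (squared distance 4); the mean is 2.0.
    print(alignment_loss(nodes, spans, identity, identity))  # prints 2.0
```

In a trained model the two projection matrices would be learned jointly with the summarizer, so that minimizing this term pulls each graph node toward the representation of the text it was extracted from.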