Abstract
Abstractive summarization systems aim to write concise summaries capturing the most essential information of the input document in their own words. One way to achieve this is to gather and combine multiple pieces of information from the source document, a process we call aggregation. Despite its importance, the extent to which both reference summaries in benchmark datasets and system-generated summaries require aggregation remains unknown. In this work, we propose AggSHAP, a measure of the degree of aggregation in a summary sentence. We show through automatic and human evaluations that AggSHAP distinguishes multi-sentence aggregation from single-sentence extraction or paraphrasing. We find that few reference or model-generated summary sentences have a high degree of aggregation as measured by the proposed metric. We also demonstrate negative correlations between AggSHAP and other quality scores of system summaries. These findings suggest the need to develop new tasks and datasets to encourage multi-sentence aggregation in summarization.
- Anthology ID: 2023.newsum-1.12
- Volume: Proceedings of the 4th New Frontiers in Summarization Workshop
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Yue Dong, Wen Xiao, Lu Wang, Fei Liu, Giuseppe Carenini
- Venue: NewSum
- Publisher: Association for Computational Linguistics
- Pages: 121–134
- URL: https://aclanthology.org/2023.newsum-1.12
- DOI: 10.18653/v1/2023.newsum-1.12
- Cite (ACL): Jingyi He, Meng Cao, and Jackie Chi Kit Cheung. 2023. Analyzing Multi-Sentence Aggregation in Abstractive Summarization via the Shapley Value. In Proceedings of the 4th New Frontiers in Summarization Workshop, pages 121–134, Singapore. Association for Computational Linguistics.
- Cite (Informal): Analyzing Multi-Sentence Aggregation in Abstractive Summarization via the Shapley Value (He et al., NewSum 2023)
- PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2023.newsum-1.12.pdf