Supervising the Centroid Baseline for Extractive Multi-Document Summarization

Simão Gonçalves, Gonçalo Correia, Diogo Pernes, Afonso Mendes


Abstract
The centroid method is a simple approach to extractive multi-document summarization, and many improvements to its pipeline have been proposed. We refine it further by adding a beam search process to the sentence selection step and an attention model for centroid estimation, which leads to improved results. We demonstrate this on several multi-document summarization datasets, including a multilingual scenario.
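The two ideas named in the abstract, scoring candidate summaries against a centroid and choosing sentences with beam search, can be illustrated with a minimal sketch. It is not the authors' implementation. The bag-of-words vectors, the cosine-similarity objective, and the names `bow`, `cosine`, and `centroid_beam_summary` are assumptions made for illustration, and the attention model for centroid estimation is not shown: the centroid here is just the sum of all sentence vectors.

```python
import math
from collections import Counter


def bow(sentence):
    """Bag-of-words vector (assumed representation): lowercased token counts."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid_beam_summary(sentences, max_sentences=2, beam_width=3):
    """Pick sentences whose summed vector is closest to the centroid.

    Hypothetical helper: the centroid is the unweighted sum of all
    sentence vectors, and the beam keeps the `beam_width` best partial
    summaries at each step instead of only the single best one.
    """
    vectors = [bow(s) for s in sentences]
    centroid = Counter()
    for v in vectors:
        centroid.update(v)

    def score(indices):
        # Similarity of the candidate summary as a whole to the centroid.
        total = Counter()
        for i in indices:
            total.update(vectors[i])
        return cosine(total, centroid)

    beam = [()]  # each state is a sorted tuple of selected sentence indices
    for _ in range(min(max_sentences, len(sentences))):
        candidates = set()
        for state in beam:
            for i in range(len(sentences)):
                if i not in state:
                    candidates.add(tuple(sorted(state + (i,))))
        # Keep the highest-scoring states; ties are broken by index order.
        beam = sorted(candidates, key=lambda c: (-score(c), c))[:beam_width]

    return [sentences[i] for i in beam[0]]


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "stocks fell sharply today",
        "the cat sat",
    ]
    print(centroid_beam_summary(docs, max_sentences=1))
```

With `beam_width=1` this reduces to greedy selection. A wider beam lets the search recover from an early choice that scores well alone but combines poorly with later sentences. The sketch always returns exactly `max_sentences` sentences when that many are available.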
Anthology ID:
2023.newsum-1.9
Volume:
Proceedings of the 4th New Frontiers in Summarization Workshop
Month:
December
Year:
2023
Address:
Singapore
Editors:
Yue Dong, Wen Xiao, Lu Wang, Fei Liu, Giuseppe Carenini
Venue:
NewSum
Publisher:
Association for Computational Linguistics
Pages:
87–96
URL:
https://aclanthology.org/2023.newsum-1.9
DOI:
10.18653/v1/2023.newsum-1.9
Cite (ACL):
Simão Gonçalves, Gonçalo Correia, Diogo Pernes, and Afonso Mendes. 2023. Supervising the Centroid Baseline for Extractive Multi-Document Summarization. In Proceedings of the 4th New Frontiers in Summarization Workshop, pages 87–96, Singapore. Association for Computational Linguistics.
Cite (Informal):
Supervising the Centroid Baseline for Extractive Multi-Document Summarization (Gonçalves et al., NewSum 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.newsum-1.9.pdf
Supplementary material:
2023.newsum-1.9.SupplementaryMaterial.txt