Exploring the limits of a base BART for multi-document summarization in the medical domain

Ishmael Obonyo, Silvia Casola, Horacio Saggion


Abstract
This paper describes our participation in the Multi-document Summarization for Literature Review (MSLR) Shared Task, in which we explore summarization models that automatically produce a review of scientific results. Rather than maximizing the metrics with computationally expensive models, we place ourselves in a setting of scarce computational resources and explore the limits of a base sequence-to-sequence model (and thus a limited input length) on the task. Although we explore methods to feed the abstractive model with salient sentences only (using a first extractive step), we find that the results still need improvement.
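The extract-then-abstract idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which extractive technique was used, so the word-frequency scoring below is a stand-in, the function names `split_sentences` and `select_salient` are hypothetical, and a word budget is assumed as a rough proxy for the abstractive model's input length limit.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def select_salient(documents, max_words=50):
    """Pick high-scoring sentences from several documents under a word budget.

    Scoring is an assumption made for illustration: each sentence gets the
    average corpus frequency of its words. Selected sentences are returned
    in their original order so the downstream model sees coherent input.
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    freq = Counter(
        w for s in sentences for w in re.findall(r"[a-z]+", s.lower())
    )

    def score(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices from most to least salient.
    ranked = sorted(
        range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True
    )

    chosen, used = [], 0
    for i in ranked:
        n_words = len(sentences[i].split())
        if used + n_words <= max_words:  # skip sentences that would exceed the budget
            chosen.append(i)
            used += n_words

    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = [
        "Drug A reduced pain. The weather was nice.",
        "Drug A reduced pain in adults.",
    ]
    # The joined output is what would be passed to the abstractive model.
    print(" ".join(select_salient(docs, max_words=10)))
```

In a real pipeline the budget would be counted in the abstractive model's own tokens rather than whitespace-separated words, and the scoring function could be replaced by any extractive summarizer.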
Anthology ID:
2022.sdp-1.23
Volume:
Proceedings of the Third Workshop on Scholarly Document Processing
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Venue:
sdp
Publisher:
Association for Computational Linguistics
Pages:
193–198
URL:
https://aclanthology.org/2022.sdp-1.23
Cite (ACL):
Ishmael Obonyo, Silvia Casola, and Horacio Saggion. 2022. Exploring the limits of a base BART for multi-document summarization in the medical domain. In Proceedings of the Third Workshop on Scholarly Document Processing, pages 193–198, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
Exploring the limits of a base BART for multi-document summarization in the medical domain (Obonyo et al., sdp 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.sdp-1.23.pdf