Translation-Augmented Multilingual Summarization for Low-Resource Languages

Prasanth


Abstract
While automatic text summarization has achieved remarkable success in English,extending these capabilities to low-resource languages remains a significantchallenge due to the scarcity of labeled training data. We propose atranslation-augmented approach to multilingual summarization: we systematicallytranslate high-quality English summarization corpora into low-resource targetlanguages using NLLB-200, and use the resulting parallel data to train andevaluate sequence-to-sequence models. We experiment across three typologicallydiverse languages—Swahili, Hausa, and Afrikaans—comparing monolingualfine-tuning (MONO), cross-lingual transfer (XLT), and joint multilingualtraining (TAMT) on mBART-large-50. Monolingual fine-tuning achieves the bestperformance for Swahili (ROUGE-L 13.9) and Afrikaans (ROUGE-L 15.7),surpassing the Lead-3 baseline in both cases, while cross-lingual transferremains strongest for Hausa (ROUGE-L 14.5). We show that native language tokenavailability in mBART-50 is a critical determinant of fine-tuning performance,and characterize the conditions under which the theoretically expectedTAMT > MONO > XLT ordering breaks down. We release our dataset, code, andevaluation infrastructure to support future research on low-resourcemultilingual summarization.
Anthology ID:
2026.ltedi-1.10
Volume:
Proceedings of the Sixth Workshop on Language Technology for Equality, Diversity, Inclusion
Month:
July
Year:
2026
Address:
Virtual (Online)
Editors:
Bharathi Raja Chakravarthi, Bharathi B, Paul Buitelaar, Durairaj Thenmozhi, Miguel Ángel García Cumbreras, Salud María Jiménez Zafra
Venues:
LTEDI | WS
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
108–117
Language:
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.ltedi-1.10/
DOI:
Bibkey:
Cite (ACL):
Prasanth. 2026. Translation-Augmented Multilingual Summarization for Low-Resource Languages. In Proceedings of the Sixth Workshop on Language Technology for Equality, Diversity, Inclusion, pages 108–117, Virtual (Online). Association for Computational Linguistics.
Cite (Informal):
Translation-Augmented Multilingual Summarization for Low-Resource Languages (Prasanth, LTEDI 2026)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.ltedi-1.10.pdf