Reassessing Speech Translation for Low-Resource Languages: Do LLMs Redefine the State-of-the-Art Against Cascaded Models?

Jonah Dauvet, Min Ma, Jessica Ojo, David Ifeoluwa Adelani


Abstract
Automatic speech translation (AST) promotes seamless communication among speakers of different languages. While current state-of-the-art models excel on high-resource languages, their performance on low-resource languages (LRLs) is not well established. We investigate this by evaluating state-of-the-art models on 10 LRLs with varying amounts of data (10-30+ hours). Across six finetuning strategies and three main AST paradigms, we observe that: (1) The latest Large Language Models (LLMs) may struggle with LRLs. (2) Our experiments suggest that, for LRLs, more AST finetuning data is not always beneficial. (3) Our 2-Stage with ASR corrector finetuning recipe can substantially improve AST performance on LRLs, achieving up to a 5.8x BLEU score boost when translating related languages to English, while remaining on par with the best monolingual finetuning in BLEU score when translating the target language to English. (4) We share effective engineering practices, including how to adapt AST models to unseen languages.
Anthology ID:
2025.mrl-main.11
Volume:
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
David Ifeoluwa Adelani, Catherine Arnett, Duygu Ataman, Tyler A. Chang, Hila Gonen, Rahul Raja, Fabian Schmidt, David Stap, Jiayi Wang
Venues:
MRL | WS
Publisher:
Association for Computational Linguistics
Pages:
149–160
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.11/
Cite (ACL):
Jonah Dauvet, Min Ma, Jessica Ojo, and David Ifeoluwa Adelani. 2025. Reassessing Speech Translation for Low-Resource Languages: Do LLMs Redefine the State-of-the-Art Against Cascaded Models?. In Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025), pages 149–160, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Reassessing Speech Translation for Low-Resource Languages: Do LLMs Redefine the State-of-the-Art Against Cascaded Models? (Dauvet et al., MRL 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.mrl-main.11.pdf