Bemba Speech Translation: Exploring a Low-Resource African Language
Muhammad Hazim Al Farouq, Aman Kassahun Wassie, Yasmin Moslem
Abstract
This paper describes our system submission to the International Conference on Spoken Language Translation (IWSLT 2025), low-resource languages track, namely for Bemba-to-English speech translation. We built cascaded speech translation systems based on Whisper and NLLB-200, and employed data augmentation techniques, such as back-translation. We investigate the effect of using synthetic data and discuss our experimental setup.
- Anthology ID:
- 2025.iwslt-1.37
- Volume:
- Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria (in-person and online)
- Editors:
- Elizabeth Salesky, Marcello Federico, Antonis Anastasopoulos
- Venues:
- IWSLT | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 354–359
- URL:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.37/
- Cite (ACL):
- Muhammad Hazim Al Farouq, Aman Kassahun Wassie, and Yasmin Moslem. 2025. Bemba Speech Translation: Exploring a Low-Resource African Language. In Proceedings of the 22nd International Conference on Spoken Language Translation (IWSLT 2025), pages 354–359, Vienna, Austria (in-person and online). Association for Computational Linguistics.
- Cite (Informal):
- Bemba Speech Translation: Exploring a Low-Resource African Language (Hazim Al Farouq et al., IWSLT 2025)
- PDF:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.iwslt-1.37.pdf
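
The cascaded design mentioned in the abstract (speech recognition first, then text translation) can be illustrated with a minimal sketch. This is not the authors' code. The function names `cascade_translate` and `build_hf_stages` are hypothetical, and the checkpoint names `openai/whisper-small` and `facebook/nllb-200-distilled-600M` are placeholder assumptions; the abstract does not say which model sizes or fine-tuned checkpoints were used. A Whisper checkpoint fine-tuned on Bemba would likely be needed in practice, since the sketch only shows how the two stages connect.

```python
from typing import Callable, Tuple


def cascade_translate(
    audio_path: str,
    transcribe: Callable[[str], str],
    translate: Callable[[str], str],
) -> str:
    """Cascaded speech translation: run ASR, then feed the transcript to MT."""
    source_text = transcribe(audio_path)  # Bemba audio -> Bemba text
    return translate(source_text)         # Bemba text -> English text


def build_hf_stages(
    asr_model: str = "openai/whisper-small",              # assumed placeholder checkpoint
    mt_model: str = "facebook/nllb-200-distilled-600M",   # assumed placeholder checkpoint
) -> Tuple[Callable[[str], str], Callable[[str], str]]:
    """Build the two stages with Hugging Face pipelines (downloads models)."""
    # Imported lazily so cascade_translate works without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=asr_model)
    mt = pipeline(
        "translation",
        model=mt_model,
        src_lang="bem_Latn",  # NLLB-200 language code for Bemba
        tgt_lang="eng_Latn",  # NLLB-200 language code for English
    )

    def transcribe(path: str) -> str:
        return asr(path)["text"]

    def translate(text: str) -> str:
        return mt(text)[0]["translation_text"]

    return transcribe, translate
```

With the models available, `transcribe, translate = build_hf_stages()` followed by `cascade_translate("clip.wav", transcribe, translate)` would run the full pipeline on a hypothetical audio file. Passing the stages in as callables keeps the cascade independent of any particular model, so either stage can be swapped for a fine-tuned variant.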