Exploring Transliteration-Based Zero-Shot Transfer for Amharic ASR

Hellina Hailu Nigatu, Hanan Aldarmaki


Abstract
The performance of Automatic Speech Recognition (ASR) depends on the availability of transcribed speech datasets, which are often scarce or non-existent for many of the world's languages. This study investigates alternative strategies to bridge the data gap using zero-shot cross-lingual transfer, leveraging transliteration as a method to utilize data from other languages. We experiment with transliteration from various source languages and demonstrate ASR performance in a low-resourced language, Amharic. We find that source data that align with the character distribution of the test data achieve the best performance, regardless of language family. We also experiment with fine-tuning with minimal transcribed data in the target language. Our findings demonstrate that transliteration, particularly when combined with a strategic choice of source languages, is a viable approach for improving ASR in zero-shot and low-resourced settings.
Anthology ID:
2025.africanlp-1.10
Volume:
Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Constantine Lignos, Idris Abdulmumin, David Adelani
Venues:
AfricaNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
64–73
URL:
https://preview.aclanthology.org/landing_page/2025.africanlp-1.10/
DOI:
10.18653/v1/2025.africanlp-1.10
Cite (ACL):
Hellina Hailu Nigatu and Hanan Aldarmaki. 2025. Exploring Transliteration-Based Zero-Shot Transfer for Amharic ASR. In Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025), pages 64–73, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Exploring Transliteration-Based Zero-Shot Transfer for Amharic ASR (Nigatu & Aldarmaki, AfricaNLP 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.africanlp-1.10.pdf