Mohammed Ishaaq Datay


2025

Benchmarking IsiXhosa Automatic Speech Recognition and Machine Translation for Digital Health Provision
Abby Blocker | Francois Meyer | Ahmed Biyabani | Joyce Mwangama | Mohammed Ishaaq Datay | Bessie Malila
Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)

As digital health becomes more widespread, it connects people across geographic regions, creating a need for accurate translation services. South Africa presents both an opportunity and a need for digital health innovation, but implementing translation systems for indigenous languages in digital health is difficult due to a lack of language resources. Understanding the accuracy of current models for medical translation of indigenous languages is crucial for designers looking to build quality digital health solutions. This paper presents a new dataset with audio and text of primary health consultations for automatic speech recognition and machine translation in South African English and the indigenous South African language of isiXhosa. We then evaluate the performance of well-established pretrained models on this dataset. We find that isiXhosa has limited support in speech recognition models and shows high, variable character error rates for transcription (26-70%). For translation tasks, Google Cloud Translate and ChatGPT outperform the other evaluated models, indicating that large language models can perform comparably to dedicated machine translation models for low-resource language translation.
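
For readers unfamiliar with the metric quoted above, character error rate (CER) is conventionally the character-level edit distance between a model transcript and the reference transcript, divided by the reference length. The sketch below is a minimal illustration of that standard definition, not the paper's evaluation code: the function names and the example strings are hypothetical, and the abstract does not say what text normalization (casing, punctuation) the authors applied before scoring.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance over characters (insert, delete, substitute)."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(
                min(
                    prev[j] + 1,                      # delete ref_char
                    curr[j - 1] + 1,                  # insert hyp_char
                    prev[j - 1] + substitution_cost,  # match or substitute
                )
            )
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Illustrative strings only; these are not taken from the paper's dataset.
    reference = "molo"
    hypothesis = "mola"
    print(f"CER: {character_error_rate(reference, hypothesis):.2%}")
```

Because the denominator is the reference length, a transcript with many inserted characters can score above 100%, so CER is an error ratio and not a bounded percentage.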