Negasi Haile Abadi
2026
Afri-MCQA: Multimodal Cultural Question Answering for African Languages
Atnafu Lambebo Tonja | Srija Anand | Emilio Villa-Cueva | Israel Abebe Azime | Jesujoba Oluwadara Alabi | Muhidin A. Mohamed | Debela Desalegn Yadeta | Negasi Haile Abadi | Abigail Oppong | Nnaemeka Casmir Obiefuna | Idris Abdulmumin | Naome A Etori | Eric Peter Wairagala | Kanda Patrick Tshinu | Imanigirimbabazi Emmanuel | Gabofetswe Malema | Alham Fikri Aji | David Ifeoluwa Adelani | Thamar Solorio
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Africa is home to over one-third of the world’s languages, yet remains severely underrepresented in multimodal AI research. We introduce Afri-MCQA, the first Multilingual Cultural Question-Answering benchmark containing 7.5k Q&A pairs across 15 African languages from 12 countries. The benchmark offers parallel text and speech modalities and was entirely created by native speakers. We find that models show poor performance across evaluated cultures, with near-zero accuracy on open-ended VQA when queried through native language or speech. To test linguistic competence, we include control experiments meant to assess this specific aspect separate from cultural knowledge, and we observe significant performance gaps between native languages and English for both text and speech. These findings underscore the pressing need for speech-first approaches, culturally grounded pretraining, and cross-lingual cultural transfer. We release Afri-MCQA to support more inclusive multimodal AI development.
2025
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding
Israel Abebe Azime | Atnafu Lambebo Tonja | Tadesse Destaw Belay | Yonas Chanie | Bontu Fufa Balcha | Negasi Haile Abadi | Henok Biadglign Ademtew | Mulubrhan Abebe Nerea | Debela Desalegn Yadeta | Derartu Dagne Geremew | Assefa Atsbiha Tesfu | Philipp Slusallek | Thamar Solorio | Dietrich Klakow
Findings of the Association for Computational Linguistics: NAACL 2025
Viability of Machine Translation for Healthcare in Low-Resourced Languages
Hellina Hailu Nigatu | Nikita Mehandru | Negasi Haile Abadi | Blen Gebremeskel | Ahmed Alaa | Monojit Choudhury
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Machine Translation errors in high-stakes settings like healthcare pose unique risks that could lead to clinical harm. The challenges are even more pronounced for low-resourced languages where human translators are scarce and MT tools perform poorly. In this work, we provide a taxonomy of Machine Translation errors for the healthcare domain using a publicly available MT system. Preparing an evaluation dataset from pre-existing medical datasets, we conduct our study focusing on two low-resourced languages: Amharic and Tigrinya. Based on our error analysis and findings from prior work, we test two pre-translation interventions, namely paraphrasing the source sentence and pivoting with a related language, for their effectiveness in reducing clinical risk. We find that MT errors for healthcare most commonly happen when the source sentence includes medical terminology and procedure descriptions, synonyms, figurative language, and word order differences. We find that pre-translation interventions are not effective in reducing clinical risk if the base translation model performs poorly. Based on our findings, we provide recommendations for improving MT for healthcare.
A Case Against Implicit Standards: Homophone Normalization in Machine Translation for Languages that use the Ge’ez Script.
Hellina Hailu Nigatu | Atnafu Lambebo Tonja | Henok Biadglign Ademtew | Hizkiel Mitiku Alemayehu | Negasi Haile Abadi | Tadesse Destaw Belay | Seid Muhie Yimam
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Homophone normalization, where characters that have the same sound in a writing script are mapped to one character, is a pre-processing step applied in Amharic Natural Language Processing (NLP) literature. While this may improve performance reported by automatic metrics, it also results in models that are unable to effectively process different forms of writing in a single language. Further, there might be impacts in transfer learning, where models trained on normalized data do not generalize well to other languages. In this paper, we experiment with monolingual training and cross-lingual transfer to understand the impacts of normalization on languages that use the Ge’ez script. We then propose a post-inference intervention in which normalization is applied to model predictions instead of training data. With our simple scheme of post-inference normalization, we show that we can achieve an increase in BLEU score of up to 1.03 while preserving language features in training.
Co-authors
- Atnafu Lambebo Tonja 3
- Henok Biadglign Ademtew 2
- Israel Abebe Azime 2
- Tadesse Destaw Belay 2
- Hellina Hailu Nigatu 2
- Thamar Solorio 2
- Debela Desalegn Yadeta 2
- Idris Abdulmumin 1
- David Ifeoluwa Adelani 1
- Alham Fikri Aji 1
- Ahmed Alaa 1
- Jesujoba Alabi 1
- Hizkiel Mitiku Alemayehu 1
- Srija Anand 1
- Bontu Fufa Balcha 1
- Yonas Chanie 1
- Monojit Choudhury 1
- Imanigirimbabazi Emmanuel 1
- Naome A. Etori 1
- Blen Gebremeskel 1
- Derartu Dagne Geremew 1
- Dietrich Klakow 1
- Gabofetswe Malema 1
- Nikita Mehandru 1
- Muhidin A. Mohamed 1
- Mulubrhan Abebe Nerea 1
- Nnaemeka Casmir Obiefuna 1
- Abigail Oppong 1
- Philipp Slusallek 1
- Assefa Atsbiha Tesfu 1
- Kanda Patrick Tshinu 1
- Emilio Villa-Cueva 1
- Eric Peter Wairagala 1
- Seid Muhie Yimam 1