AfroBench: How Good are Large Language Models on African Languages?

Jessica Ojo; Odunayo Ogundepo; Akintunde Oladipo; Kelechi Ogueji; Jimmy Lin; Pontus Stenetorp; David Ifeoluwa Adelani

Abstract

Large-scale multilingual evaluations, such as MEGA, often include only a handful of African languages due to the scarcity of high-quality evaluation data and the limited discoverability of existing African datasets. This lack of representation hinders comprehensive LLM evaluation across a diverse range of languages and tasks. To address these challenges, we introduce AFROBENCH, a multi-task benchmark for evaluating the performance of LLMs across 64 African languages, 15 tasks and 22 datasets. AFROBENCH consists of nine natural language understanding datasets, six text generation datasets, six knowledge and question answering tasks, and one mathematical reasoning task. We present results comparing the performance of prompting LLMs to fine-tuned baselines based on BERT and T5-style models. Our results suggest large gaps in performance between high-resource languages, such as English, and African languages across most tasks, but performance also varies based on the availability of monolingual data resources. Our findings confirm that performance on African languages remains a hurdle for current LLMs, underscoring the need for additional efforts to close this gap.

Anthology ID: 2025.findings-acl.976
Volume: Findings of the Association for Computational Linguistics: ACL 2025
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 19048–19095
URL: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.976/
Cite (ACL): Jessica Ojo, Odunayo Ogundepo, Akintunde Oladipo, Kelechi Ogueji, Jimmy Lin, Pontus Stenetorp, and David Ifeoluwa Adelani. 2025. AfroBench: How Good are Large Language Models on African Languages? In Findings of the Association for Computational Linguistics: ACL 2025, pages 19048–19095, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): AfroBench: How Good are Large Language Models on African Languages? (Ojo et al., Findings 2025)
PDF: https://preview.aclanthology.org/display_plenaries/2025.findings-acl.976.pdf
