Challenging Multimodal LLMs with African Standardized Exams: A Document VQA Evaluation

Victor Tolulope Olufemi, Oreoluwa Boluwatife Babatunde, Emmanuel Bolarinwa, Kausar Yetunde Moshood


Abstract
Despite rapid advancements in multimodal large language models (MLLMs), their ability to process low-resource African languages in document-based visual question answering (VQA) tasks remains limited. This paper evaluates three state-of-the-art MLLMs—GPT-4o, Claude-3.5 Haiku, and Gemini-1.5 Pro—on WAEC/NECO standardized exam questions in Yoruba, Igbo, and Hausa. We curate a dataset of multiple-choice questions from exam images and compare model accuracies across two prompting strategies: (1) using English prompts for African language questions, and (2) using native-language prompts. While GPT-4o achieves over 90% accuracy for English, performance drops below 40% for African languages, highlighting severe data imbalance in model training. Notably, native-language prompting improves accuracy for most models, yet no system approaches human-level performance, which reaches over 50% in Yoruba, Igbo, and Hausa. These findings emphasize the need for diverse training data, fine-tuning, and dedicated benchmarks that address the linguistic intricacies of African languages in multimodal tasks, paving the way for more equitable and effective AI systems in education.
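The comparison described in the abstract comes down to computing multiple-choice accuracy separately for each model, language, and prompting strategy. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code. The record fields (`model`, `language`, `prompt_strategy`, `predicted`, `answer`), the strategy labels, and the toy records are all assumptions made for illustration; none of the numbers are results from the paper.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Return {(model, language, prompt_strategy): accuracy}.

    Each record is a dict describing one multiple-choice question:
    the model's predicted option letter and the gold option letter.
    The field names are hypothetical, chosen for this sketch.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        key = (r["model"], r["language"], r["prompt_strategy"])
        total[key] += 1
        # Compare option letters case-insensitively, ignoring stray whitespace.
        if r["predicted"].strip().upper() == r["answer"].strip().upper():
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


# Toy records invented for illustration only (not data from the paper).
toy_records = [
    {"model": "model_a", "language": "Yoruba", "prompt_strategy": "english_prompt",
     "predicted": "A", "answer": "B"},
    {"model": "model_a", "language": "Yoruba", "prompt_strategy": "english_prompt",
     "predicted": "c", "answer": "C"},
    {"model": "model_a", "language": "Yoruba", "prompt_strategy": "native_prompt",
     "predicted": "B", "answer": "B"},
    {"model": "model_a", "language": "Yoruba", "prompt_strategy": "native_prompt",
     "predicted": " D ", "answer": "D"},
]

toy_accuracy = accuracy_by_condition(toy_records)
for condition, acc in sorted(toy_accuracy.items()):
    print(condition, f"{acc:.2f}")
```

Keying the totals by the full (model, language, prompt strategy) triple keeps the two prompting strategies separate, so the English-prompt and native-language-prompt conditions can be compared directly for the same model and language.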
Anthology ID:
2025.africanlp-1.22
Volume:
Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Constantine Lignos, Idris Abdulmumin, David Adelani
Venues:
AfricaNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
150–157
URL:
https://preview.aclanthology.org/landing_page/2025.africanlp-1.22/
DOI:
10.18653/v1/2025.africanlp-1.22
Cite (ACL):
Victor Tolulope Olufemi, Oreoluwa Boluwatife Babatunde, Emmanuel Bolarinwa, and Kausar Yetunde Moshood. 2025. Challenging Multimodal LLMs with African Standardized Exams: A Document VQA Evaluation. In Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025), pages 150–157, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Challenging Multimodal LLMs with African Standardized Exams: A Document VQA Evaluation (Olufemi et al., AfricaNLP 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.africanlp-1.22.pdf