AfriCaption: Establishing a New Paradigm for Image Captioning in African Languages
Mardiyyah Oduwole, Prince Mireku, Fatimo Adebanjo, Oluwatosin Olajide, Mahi Aminu Aliyu, Jekaterina Novikova
Abstract
Multimodal AI research has overwhelmingly focused on high-resource languages, hindering the democratization of advancements in the field. To address this, we present AfriCaption, a comprehensive framework for multilingual image captioning in 20 African languages. Our contributions are threefold: (i) a curated dataset built on Flickr8k, featuring semantically aligned captions generated via a context-aware selection and translation process; (ii) a dynamic, context-preserving pipeline that ensures ongoing quality through model ensembling and adaptive substitution; and (iii) the AfriCaption model, a 0.5B-parameter vision-to-text architecture that integrates SigLIP and NLLB200 for caption generation across underrepresented languages. This unified framework ensures ongoing data quality and establishes the first scalable image-captioning resource for underrepresented African languages, laying the groundwork for truly inclusive multimodal AI.
- Anthology ID:
- 2026.africanlp-main.5
- Volume:
- Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026)
- Month:
- March
- Year:
- 2026
- Address:
- Rabat, Morocco
- Editors:
- Everlyn Asiko Chimoto, Constantine Lignos, Shamsuddeen Muhammad, Idris Abdulmumin, Clemencia Siro, David Ifeoluwa Adelani
- Venues:
- AfricaNLP | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 44–55
- URL:
- https://preview.aclanthology.org/manual-author-scripts/2026.africanlp-main.5/
- Cite (ACL):
- Mardiyyah Oduwole, Prince Mireku, Fatimo Adebanjo, Oluwatosin Olajide, Mahi Aminu Aliyu, and Jekaterina Novikova. 2026. AfriCaption: Establishing a New Paradigm for Image Captioning in African Languages. In Proceedings of the 7th Workshop on African Natural Language Processing (AfricaNLP 2026), pages 44–55, Rabat, Morocco. Association for Computational Linguistics.
- Cite (Informal):
- AfriCaption: Establishing a New Paradigm for Image Captioning in African Languages (Oduwole et al., AfricaNLP 2026)
- PDF:
- https://preview.aclanthology.org/manual-author-scripts/2026.africanlp-main.5.pdf