Gabofetswe Malema
2026
Afri-MCQA: Multimodal Cultural Question Answering for African Languages
Atnafu Lambebo Tonja | Srija Anand | Emilio Villa-Cueva | Israel Abebe Azime | Jesujoba Oluwadara Alabi | Muhidin A. Mohamed | Debela Desalegn Yadeta | Negasi Haile Abadi | Abigail Oppong | Nnaemeka Casmir Obiefuna | Idris Abdulmumin | Naome A Etori | Eric Peter Wairagala | Kanda Patrick Tshinu | Imanigirimbabazi Emmanuel | Gabofetswe Malema | Alham Fikri Aji | David Ifeoluwa Adelani | Thamar Solorio
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Africa is home to over one-third of the world’s languages, yet remains severely underrepresented in multimodal AI research. We introduce Afri-MCQA, the first Multilingual Cultural Question-Answering benchmark containing 7.5k Q&A pairs across 15 African languages from 12 countries. The benchmark offers parallel text and speech modalities and was entirely created by native speakers. We find that models show poor performance across evaluated cultures, with near-zero accuracy on open-ended VQA when queried through native language or speech. To test linguistic competence, we include control experiments meant to assess this specific aspect separate from cultural knowledge, and we observe significant performance gaps between native languages and English for both text and speech. These findings underscore the pressing need for speech-first approaches, culturally grounded pretraining, and cross-lingual cultural transfer. We release Afri-MCQA to support more inclusive multimodal AI development.
2020
Complex Setswana Parts of Speech Tagging
Gabofetswe Malema | Boago Okgetheng | Bopaki Tebalo | Moffat Motlhanka | Goaletsa Rammidi
Proceedings of the first workshop on Resources for African Indigenous Languages
Setswana is one of the Bantu languages written disjunctively. Some of its parts of speech, such as qualificatives and some adverbs, are made up of multiple words. This disjunctive style of writing poses a challenge for tokenization and tagging. A few studies have addressed the identification of multi-word parts of speech. In this study we go further and tokenize complex parts of speech, which are formed by extending the basic forms of multi-word parts of speech: further parts of speech are recursively concatenated to a basic form. We developed rules for building complex relative parts of speech. A morphological analyzer and Python NLTK are used to tag individual words and basic forms of multi-word parts of speech, and the developed rules are then used to identify complex parts of speech. Results on 300 sentences of text give a performance of 74%. The tagger fails when it encounters expansion rules that are not implemented and when the tagging by the morphological analyzer is incorrect.
Co-authors
- Negasi Haile Abadi 1
- Idris Abdulmumin 1
- David Ifeoluwa Adelani 1
- Alham Fikri Aji 1
- Jesujoba Alabi 1
- Srija Anand 1
- Israel Abebe Azime 1
- Imanigirimbabazi Emmanuel 1
- Naome A. Etori 1
- Muhidin A. Mohamed 1
- Moffat Motlhanka 1
- Nnaemeka Casmir Obiefuna 1
- Boago Okgetheng 1
- Abigail Oppong 1
- Goaletsa Rammidi 1
- Thamar Solorio 1
- Bopaki Tebalo 1
- Atnafu Lambebo Tonja 1
- Kanda Patrick Tshinu 1
- Emilio Villa-Cueva 1
- Eric Peter Wairagala 1
- Debela Desalegn Yadeta 1