Chiamaka Chukwuneke


2023

pdf
MasakhaNEWS: News Topic Classification for African languages
David Ifeoluwa Adelani | Marek Masiak | Israel Abebe Azime | Jesujoba Alabi | Atnafu Lambebo Tonja | Christine Mwase | Odunayo Ogundepo | Bonaventure F. P. Dossou | Akintunde Oladipo | Doreen Nixdorf | Chris Chinenye Emezue | Sana Al-azzawi | Blessing Sibanda | Davis David | Lolwethu Ndolela | Jonathan Mukiibi | Tunde Ajayi | Tatiana Moteu | Brian Odhiambo | Abraham Owodunni | Nnaemeka Obiefuna | Muhidin Mohamed | Shamsuddeen Hassan Muhammad | Teshome Mulugeta Ababu | Saheed Abdullahi Salahudeen | Mesay Gemeda Yigezu | Tajuddeen Gwadabe | Idris Abdulmumin | Mahlet Taye | Oluwabusayo Awoyomi | Iyanuoluwa Shode | Tolulope Adelani | Habiba Abdulganiyu | Abdul-Hakeem Omotayo | Adetola Adeeko | Abeeb Afolabi | Anuoluwapo Aremu | Olanrewaju Samuel | Clemencia Siro | Wangari Kimotho | Onyekachi Ogbu | Chinedu Mbonu | Chiamaka Chukwuneke | Samuel Fanijo | Jessica Ojo | Oyinkansola Awosan | Tadesse Kebede | Toadoum Sari Sakayo | Pamela Nyatsine | Freedmore Sidume | Oreen Yousuf | Mardiyyah Oduwole | Kanda Tshinu | Ussen Kimanuka | Thina Diko | Siyanda Nxakama | Sinodos Nigusse | Abdulmejid Johar | Shafie Mohamed | Fuad Mire Hassan | Moges Ahmed Mehamed | Evrard Ngabire | Jules Jules | Ivan Ssenkungu | Pontus Stenetorp
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

pdf
Findings of the 1st Shared Task on Multi-lingual Multi-task Information Retrieval at MRL 2023
Francesco Tinner | David Ifeoluwa Adelani | Chris Emezue | Mammad Hajili | Omer Goldman | Muhammad Farid Adilazuarda | Muhammad Dehan Al Kautsar | Aziza Mirsaidova | Müge Kural | Dylan Massey | Chiamaka Chukwuneke | Chinedu Mbonu | Damilola Oluwaseun Oloyede | Kayode Olaleye | Jonathan Atala | Benjamin A. Ajibade | Saksham Bassi | Rahul Aralikatte | Najoung Kim | Duygu Ataman
Proceedings of the 3rd Workshop on Multi-lingual Representation Learning (MRL)

pdf
Cross-lingual Open-Retrieval Question Answering for African Languages
Odunayo Ogundepo | Tajuddeen Gwadabe | Clara Rivera | Jonathan Clark | Sebastian Ruder | David Adelani | Bonaventure Dossou | Abdou Diop | Claytone Sikasote | Gilles Hacheme | Happy Buzaaba | Ignatius Ezeani | Rooweither Mabuya | Salomey Osei | Chris Emezue | Albert Kahira | Shamsuddeen Muhammad | Akintunde Oladipo | Abraham Owodunni | Atnafu Tonja | Iyanuoluwa Shode | Akari Asai | Anuoluwapo Aremu | Ayodele Awokoya | Bernard Opoku | Chiamaka Chukwuneke | Christine Mwase | Clemencia Siro | Stephen Arthur | Tunde Ajayi | Verrah Otiende | Andre Rubungo | Boyd Sinkala | Daniel Ajisafe | Emeka Onwuegbuzia | Falalu Lawan | Ibrahim Ahmad | Jesujoba Alabi | Chinedu Mbonu | Mofetoluwa Adeyemi | Mofya Phiri | Orevaoghene Ahia | Ruqayya Iro | Sonia Adhiambo
Findings of the Association for Computational Linguistics: EMNLP 2023

African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems – those that retrieve answer content from other languages while serving people in their native language—offer a means of filling this gap. To this end, we create Our Dataset, the first cross-lingual QA dataset with a focus on African languages. Our Dataset includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, Our Dataset focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, Our Dataset proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.

2022

pdf
IgboBERT Models: Building and Training Transformer Models for the Igbo Language
Chiamaka Chukwuneke | Ignatius Ezeani | Paul Rayson | Mahmoud El-Haj
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This work presents a standard Igbo named entity recognition (IgboNER) dataset as well as the results from training and fine-tuning state-of-the-art transformer IgboNER models. We discuss the process of our dataset creation - data collection and annotation and quality checking. We also present experimental processes involved in building an IgboBERT language model from scratch as well as fine-tuning it along with other non-Igbo pre-trained models for the downstream IgboNER task. Our results show that, although the IgboNER task benefited hugely from fine-tuning large transformer model, fine-tuning a transformer model built from scratch with comparatively little Igbo text data seems to yield quite decent results for the IgboNER task. This work will contribute immensely to IgboNLP in particular as well as the wider African and low-resource NLP efforts Keywords: Igbo, named entity recognition, BERT models, under-resourced, dataset

2021

pdf
MasakhaNER: Named Entity Recognition for African Languages
David Ifeoluwa Adelani | Jade Abbott | Graham Neubig | Daniel D’souza | Julia Kreutzer | Constantine Lignos | Chester Palen-Michel | Happy Buzaaba | Shruti Rijhwani | Sebastian Ruder | Stephen Mayhew | Israel Abebe Azime | Shamsuddeen H. Muhammad | Chris Chinenye Emezue | Joyce Nakatumba-Nabende | Perez Ogayo | Aremu Anuoluwapo | Catherine Gitau | Derguene Mbaye | Jesujoba Alabi | Seid Muhie Yimam | Tajuddeen Rabiu Gwadabe | Ignatius Ezeani | Rubungo Andre Niyongabo | Jonathan Mukiibi | Verrah Otiende | Iroro Orife | Davis David | Samba Ngom | Tosin Adewumi | Paul Rayson | Mofetoluwa Adeyemi | Gerald Muriuki | Emmanuel Anebi | Chiamaka Chukwuneke | Nkiruka Odu | Eric Peter Wairagala | Samuel Oyerinde | Clemencia Siro | Tobius Saul Bateesa | Temilola Oloyede | Yvonne Wambui | Victor Akinode | Deborah Nabagereka | Maurice Katusiime | Ayodele Awokoya | Mouhamadane MBOUP | Dibora Gebreyohannes | Henok Tilaye | Kelechi Nwaike | Degaga Wolde | Abdoulaye Faye | Blessing Sibanda | Orevaoghene Ahia | Bonaventure F. P. Dossou | Kelechi Ogueji | Thierno Ibrahima DIOP | Abdoulaye Diallo | Adewale Akinfaderin | Tendai Marengereke | Salomey Osei
Transactions of the Association for Computational Linguistics, Volume 9

We take a step towards addressing the under- representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state- of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.1
Search
Co-authors
Venues