Alcides Alcoba Inciarte


2023

SERENGETI: Massively Multilingual Language Models for Africa
Ife Adebara | AbdelRahim Elmadany | Muhammad Abdul-Mageed | Alcides Alcoba Inciarte
Findings of the Association for Computational Linguistics: ACL 2023

Multilingual pretrained language models (mPLMs) acquire valuable, generalizable linguistic information during pretraining and have advanced the state of the art on task-specific finetuning. To date, only ~31 out of ~2,000 African languages are covered in existing language models. We ameliorate this limitation by developing SERENGETI, a set of massively multilingual language models that covers 517 African languages and language varieties. We evaluate our novel models on eight natural language understanding tasks across 20 datasets, comparing to 4 mPLMs that cover 4-23 African languages. SERENGETI outperforms other models on 11 datasets across the eight tasks, achieving 82.27 average F1. We also perform analyses of errors from our models, which allows us to investigate the influence of language genealogy and linguistic similarity when the models are applied under zero-shot settings. We will publicly release our models for research.

SIDLR: Slot and Intent Detection Models for Low-Resource Language Varieties
Sang Yun Kwon | Gagan Bhatia | Elmoatez Billah Nagoudi | Alcides Alcoba Inciarte | Muhammad Abdul-Mageed
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023)

Intent detection and slot filling are two critical tasks in spoken and natural language understanding for task-oriented dialog systems. In this work, we describe our participation in slot and intent detection for low-resource language varieties (SID4LR) (Aepli et al., 2023). We investigate the slot and intent detection (SID) tasks using a wide range of models and settings. Given the recent success of multitask prompted finetuning of large language models, we also test the generalization capability of the recent encoder-decoder model mT0 (Muennighoff et al., 2022) on new tasks (i.e., SID) in languages it has never intentionally seen. We show that our best model outperforms the baseline by a large margin (up to +30 F1 points) in both SID tasks.