Alcides Alcoba Inciarte
2023
SERENGETI: Massively Multilingual Language Models for Africa
Ife Adebara | AbdelRahim Elmadany | Muhammad Abdul-Mageed | Alcides Alcoba Inciarte
Findings of the Association for Computational Linguistics: ACL 2023
Multilingual pretrained language models (mPLMs) acquire valuable, generalizable linguistic information during pretraining and have advanced the state of the art on task-specific finetuning. To date, only ~31 out of ~2,000 African languages are covered in existing language models. We ameliorate this limitation by developing SERENGETI, a set of massively multilingual language models that covers 517 African languages and language varieties. We evaluate our novel models on eight natural language understanding tasks across 20 datasets, comparing to 4 mPLMs that cover 4-23 African languages. SERENGETI outperforms other models on 11 datasets across the eight tasks, achieving 82.27 average F_1. We also perform analyses of errors from our models, which allows us to investigate the influence of language genealogy and linguistic similarity when the models are applied under zero-shot settings. We will publicly release our models for research.
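As a rough illustration of how a released checkpoint like SERENGETI could be probed once public, the sketch below loads a multilingual masked language model with Hugging Face transformers. The model identifier "UBC-NLP/serengeti", the mask-filling setup, and the Swahili example sentence are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch: probe a multilingual masked LM with a fill-mask query.
# The model ID below is an assumption; substitute the released checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "UBC-NLP/serengeti"  # assumed identifier for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Swahili example: "Children like <mask> very much."
text = f"Watoto wanapenda {tokenizer.mask_token} sana."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and print the model's top 5 candidate tokens.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5, dim=-1).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```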
SIDLR: Slot and Intent Detection Models for Low-Resource Language Varieties
Sang Yun Kwon | Gagan Bhatia | Elmoatez Billah Nagoudi | Alcides Alcoba Inciarte | Muhammad Abdul-Mageed
Tenth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2023)
Intent detection and slot filling are two critical tasks in spoken and natural language understanding for task-oriented dialog systems. In this work, we describe our participation in slot and intent detection for low-resource language varieties (SID4LR) (Aepli et al., 2023). We investigate the slot and intent detection (SID) tasks using a wide range of models and settings. Given the recent success of multitask prompted finetuning of large language models, we also test the generalization capability of the recent encoder-decoder model mT0 (Muennighoff et al., 2022) on new tasks (i.e., SID) in languages they have never intentionally seen. We show that our best model outperforms the baseline by a large margin (up to +30 F1 points) in both SID tasks.
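A minimal sketch of the kind of zero-shot prompting setup described above, using the publicly available "bigscience/mt0-small" checkpoint; the prompt template and intent label set are illustrative assumptions, not the SID4LR data or the authors' prompts.

```python
# Minimal sketch: zero-shot intent detection by prompting an mT0 checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "bigscience/mt0-small"  # small public mT0 variant for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Hypothetical utterance and label set, not taken from the shared task data.
utterance = "wake me up at seven tomorrow morning"
labels = ["alarm/set_alarm", "weather/find", "reminder/set_reminder"]
prompt = (
    f"Utterance: {utterance}\n"
    f"Choose the intent from: {', '.join(labels)}.\n"
    "Intent:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```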
Co-authors
- Muhammad Abdul-Mageed 2
- Ife Adebara 1
- AbdelRahim Elmadany 1
- Sang Yun Kwon 1
- Gagan Bhatia 1