iCompass Working Notes for the Nuanced Arabic Dialect Identification Shared task

Abir Messaoudi, Chayma Fourati, Hatem Haddad, Moez BenHajhmida


Abstract
We describe the system we submitted to the Nuanced Arabic Dialect Identification (NADI) shared task. We tackled only the first subtask (Subtask 1). We used state-of-the-art deep learning models and pre-trained contextualized text representation models, which we fine-tuned for the downstream task at hand. We first used an Arabic BERT variant, MARBERT, in its two versions: MARBERT v1 and MARBERT v2. We then combined MARBERT embeddings with a CNN classifier, and finally we tested a Quasi-Recurrent Neural Network (QRNN) model. Our results show that MARBERT v2 outperforms all of the other models on Subtask 1.
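To make the "embeddings combined with a CNN classifier" idea concrete, here is a minimal sketch in plain Python. It is not the authors' architecture, which the abstract does not specify. It only illustrates one common pattern: a 1D convolution over token embeddings, max-over-time pooling, and a softmax output layer. The function names, the toy dimensions, and the number of classes are all assumptions, and random vectors stand in for the contextual embeddings an encoder such as MARBERT would produce.

```python
import math
import random


def conv1d_max_pool(embeddings, kernels, bias):
    """Slide each kernel over the token axis, apply ReLU, keep the max per kernel.

    embeddings: list of T token vectors of length D (stand-in for encoder output)
    kernels:    list of F filters, each a list of K weight vectors of length D
    bias:       list of F floats
    Returns a list of F pooled features.
    """
    pooled = []
    for kernel, b in zip(kernels, bias):
        width = len(kernel)
        best = 0.0  # ReLU output is never below zero
        for start in range(len(embeddings) - width + 1):
            window = embeddings[start:start + width]
            act = b + sum(
                w * x
                for k_vec, tok in zip(kernel, window)
                for w, x in zip(k_vec, tok)
            )
            best = max(best, act)
        pooled.append(best)
    return pooled


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embeddings, kernels, bias, out_weights, out_bias):
    """Hypothetical CNN head: pooled convolution features -> linear layer -> softmax."""
    features = conv1d_max_pool(embeddings, kernels, bias)
    logits = [
        ob + sum(w * f for w, f in zip(row, features))
        for row, ob in zip(out_weights, out_bias)
    ]
    return softmax(logits)


# Toy demo with assumed sizes: 6 tokens, 4-dim embeddings, 3 filters of width 2, 3 classes.
rng = random.Random(0)
T, D, F, K, C = 6, 4, 3, 2, 3
toy_embeddings = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(T)]
toy_kernels = [[[rng.uniform(-1, 1) for _ in range(D)] for _ in range(K)] for _ in range(F)]
toy_out = [[rng.uniform(-1, 1) for _ in range(F)] for _ in range(C)]
probs = classify(toy_embeddings, toy_kernels, [0.0] * F, toy_out, [0.0] * C)
print(probs)  # one probability per (hypothetical) dialect class
```

In a real system the weights would be learned during training, and the embeddings would come from the fine-tuned encoder rather than a random generator.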
Anthology ID:
2022.wanlp-1.41
Volume:
Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Editors:
Houda Bouamor, Hend Al-Khalifa, Kareem Darwish, Owen Rambow, Fethi Bougares, Ahmed Abdelali, Nadi Tomeh, Salam Khalifa, Wajdi Zaghouani
Venue:
WANLP
Publisher:
Association for Computational Linguistics
Pages:
415–419
URL:
https://aclanthology.org/2022.wanlp-1.41
DOI:
10.18653/v1/2022.wanlp-1.41
Cite (ACL):
Abir Messaoudi, Chayma Fourati, Hatem Haddad, and Moez BenHajhmida. 2022. iCompass Working Notes for the Nuanced Arabic Dialect Identification Shared task. In Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP), pages 415–419, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
iCompass Working Notes for the Nuanced Arabic Dialect Identification Shared task (Messaoudi et al., WANLP 2022)
PDF:
https://preview.aclanthology.org/landing_page/2022.wanlp-1.41.pdf