NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task

Muhammad Abdul-Mageed; Chiyu Zhang; Houda Bouamor; Nizar Habash

Abstract

We present the results and findings of the First Nuanced Arabic Dialect Identification Shared Task (NADI). The shared task includes two subtasks: country-level dialect identification (Subtask 1) and province-level sub-dialect identification (Subtask 2). The data covers a total of 100 provinces from 21 Arab countries and was collected from Twitter. As such, NADI is the first shared task to target naturally occurring fine-grained dialectal text at the sub-country level. A total of 61 teams from 25 countries registered to participate, reflecting the community's interest in this area. We received 47 submissions for Subtask 1 from 18 teams and 9 submissions for Subtask 2 from 9 teams.

Anthology ID: 2020.wanlp-1.9
Volume: Proceedings of the Fifth Arabic Natural Language Processing Workshop
Month: December
Year: 2020
Address: Barcelona, Spain (Online)
Venue: WANLP
Publisher: Association for Computational Linguistics
Pages: 97–110
URL: https://aclanthology.org/2020.wanlp-1.9
Cite (ACL): Muhammad Abdul-Mageed, Chiyu Zhang, Houda Bouamor, and Nizar Habash. 2020. NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task. In Proceedings of the Fifth Arabic Natural Language Processing Workshop, pages 97–110, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal): NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task (Abdul-Mageed et al., WANLP 2020)
PDF: https://preview.aclanthology.org/starsem-semeval-split/2020.wanlp-1.9.pdf