Abstract
In this paper, we describe our team’s work on the NADI Shared Task, which requires classifying Arabic tweets into 21 dialects. We tested several approaches, and the simplest one performed best. Our best submission used a Multinomial Naive Bayes (MNB) classifier (Small and Hsiao, 1985) with n-grams as features. Despite its simplicity, this classifier achieved better results in our experiments than more complex models such as BERT. Our best submission scored 17% F1 and 35% accuracy.
- Anthology ID:
- 2020.wanlp-1.22
- Volume:
- Proceedings of the Fifth Arabic Natural Language Processing Workshop
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Editors:
- Imed Zitouni, Muhammad Abdul-Mageed, Houda Bouamor, Fethi Bougares, Mahmoud El-Haj, Nadi Tomeh, Wajdi Zaghouani
- Venue:
- WANLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 237–242
- URL:
- https://aclanthology.org/2020.wanlp-1.22
- Cite (ACL):
- Mutaz Younes, Nour Al-khdour, and Mohammad AL-Smadi. 2020. Team Alexa at NADI Shared Task. In Proceedings of the Fifth Arabic Natural Language Processing Workshop, pages 237–242, Barcelona, Spain (Online). Association for Computational Linguistics.
- Cite (Informal):
- Team Alexa at NADI Shared Task (Younes et al., WANLP 2020)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2020.wanlp-1.22.pdf
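
The approach named in the abstract, a Multinomial Naive Bayes classifier over n-gram features, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the character-level n-grams of length 1 to 3, the add-one smoothing, the class name `NgramMNB`, and the toy two-label data are all assumptions made here for the example. The abstract does not state whether the n-grams were word-level or character-level.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n_min=1, n_max=3):
    """Return all character n-grams of length n_min..n_max (assumed feature choice)."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


class NgramMNB:
    """Hypothetical Multinomial Naive Bayes over n-gram counts, add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        self.totals = {c: sum(self.feature_counts[c].values())
                       for c in self.class_counts}
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for c in self.class_counts:
            # log prior plus summed log likelihoods of the observed n-grams
            score = math.log(self.class_counts[c] / n_docs)
            denom = self.totals[c] + self.alpha * vocab_size
            for g in char_ngrams(text):
                if g in self.vocab:  # skip n-grams never seen in training
                    score += math.log(
                        (self.feature_counts[c][g] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Toy placeholder data, not real tweets or real dialect labels.
texts = ["aaa aab", "aab aaa", "zzz zzy", "zzy zzz"]
labels = ["A", "A", "B", "B"]
model = NgramMNB().fit(texts, labels)
print(model.predict("aaa"), model.predict("zzy"))
```

In the shared-task setting, `labels` would hold the 21 dialect classes and `texts` the training tweets; the same scoring rule applies unchanged.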