Abstract
This paper describes the approach adopted by the SMarT research group to build a dialect identification system for the Madar shared task on Arabic fine-grained dialect identification. We experimented with several approaches and ultimately chose a Multinomial Naive Bayes classifier based on word and character n-grams, combined with language model probabilities. We achieved a macro accuracy of 67.73% and a macro-averaged F1-score of 67.31%.
- Anthology ID:
- W19-4633
- Volume:
- Proceedings of the Fourth Arabic Natural Language Processing Workshop
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
- Venue:
- WANLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 259–263
- URL:
- https://aclanthology.org/W19-4633
- DOI:
- 10.18653/v1/W19-4633
- Cite (ACL):
- Karima Meftouh, Karima Abidi, Salima Harrat, and Kamel Smaili. 2019. The SMarT Classifier for Arabic Fine-Grained Dialect Identification. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 259–263, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- The SMarT Classifier for Arabic Fine-Grained Dialect Identification (Meftouh et al., WANLP 2019)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/W19-4633.pdf
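
The abstract mentions a Multinomial Naive Bayes classifier over word and character n-grams. As a rough illustration of that general technique only, and not the authors' implementation, the sketch below builds such a classifier in plain Python. The n-gram ranges, the add-one smoothing, the class and function names, and the toy training strings and labels are all assumptions made for this example; the language model probabilities used in the paper are left out.

```python
import math
from collections import Counter, defaultdict


def extract_features(text, word_n=(1, 2), char_n=(2, 3)):
    """Bag of word n-grams and character n-grams (ranges are assumptions)."""
    feats = Counter()
    words = text.split()
    for n in range(word_n[0], word_n[1] + 1):
        for i in range(len(words) - n + 1):
            feats["w:" + " ".join(words[i:i + n])] += 1
    for n in range(char_n[0], char_n[1] + 1):
        for i in range(len(text) - n + 1):
            feats["c:" + text[i:i + n]] += 1
    return feats


class MultinomialNB:
    """Multinomial Naive Bayes with add-alpha smoothing (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = extract_features(text)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        # Total feature mass per class, used as the likelihood denominator.
        self.totals = {c: sum(self.feat_counts[c].values())
                       for c in self.class_counts}
        return self

    def log_scores(self, text):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        feats = extract_features(text)
        scores = {}
        for c, count in self.class_counts.items():
            score = math.log(count / n_docs)  # log prior
            for feat, k in feats.items():
                if feat not in self.vocab:
                    continue  # skip features never seen in training
                p = ((self.feat_counts[c][feat] + self.alpha)
                     / (self.totals[c] + self.alpha * vocab_size))
                score += k * math.log(p)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data: placeholder strings and labels, not real dialect text.
train_texts = ["aaa bbb ccc", "aaa bbb ddd", "xxx yyy zzz", "xxx yyy www"]
train_labels = ["dialect_1", "dialect_1", "dialect_2", "dialect_2"]

clf = MultinomialNB().fit(train_texts, train_labels)
print(clf.predict("aaa bbb"))
```

Prefixing features with `w:` and `c:` keeps word and character n-grams in one shared vocabulary without collisions. A system along the lines described in the abstract would additionally need the language model probabilities, which this sketch does not model.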