Hierarchical Deep Learning for Arabic Dialect Identification
Gael de Francony, Victor Guichard, Praveen Joshi, Haithem Afli, Abdessalam Bouchekif
Abstract
In this paper, we present two approaches to fine-grained Arabic dialect identification. The first is based on recurrent neural networks (BLSTM, BGRU) with hierarchical classification: the classification of each sentence is split into two stages, a coarse-grained stage (8 classes) followed by a fine-grained stage (26 classes). The second approach is a voting system based on Naive Bayes and Random Forest. Our system achieves an F1 score of 63.02% on the subtask evaluation dataset.
- Anthology ID:
- W19-4631
- Volume:
- Proceedings of the Fourth Arabic Natural Language Processing Workshop
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
- Venue:
- WANLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 249–253
- URL:
- https://preview.aclanthology.org/ingest_wac_2008/W19-4631/
- DOI:
- 10.18653/v1/W19-4631
- Cite (ACL):
- Gael de Francony, Victor Guichard, Praveen Joshi, Haithem Afli, and Abdessalam Bouchekif. 2019. Hierarchical Deep Learning for Arabic Dialect Identification. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 249–253, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- Hierarchical Deep Learning for Arabic Dialect Identification (de Francony et al., WANLP 2019)
- PDF:
- https://preview.aclanthology.org/ingest_wac_2008/W19-4631.pdf
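
The two-stage classification described in the abstract can be illustrated with a minimal routing sketch. This is not the authors' implementation: the paper uses BLSTM/BGRU models, whereas here each classifier is an arbitrary callable, and the names `hierarchical_predict`, `toy_coarse`, `toy_fine`, and the group and dialect labels are hypothetical placeholders chosen for illustration. The fallback to the coarse label when no fine-grained model exists is also an assumption, not something the abstract specifies.

```python
from typing import Callable, Dict

# A classifier is any callable mapping a sentence to a label
# (an assumption for this sketch; the paper uses BLSTM/BGRU networks).
Classifier = Callable[[str], str]


def hierarchical_predict(
    sentence: str,
    coarse: Classifier,
    fine_by_group: Dict[str, Classifier],
) -> str:
    """Stage 1 picks a coarse group; stage 2 picks a label within that group."""
    group = coarse(sentence)
    fine = fine_by_group.get(group)
    if fine is None:
        # Assumed fallback: no fine-grained model for this group,
        # so return the coarse label itself.
        return group
    return fine(sentence)


# Toy stand-ins: hypothetical rules, not trained models.
def toy_coarse(sentence: str) -> str:
    return "GROUP_A" if "a" in sentence else "GROUP_B"


toy_fine: Dict[str, Classifier] = {
    "GROUP_A": lambda s: "A1" if len(s) % 2 == 0 else "A2",
    "GROUP_B": lambda s: "B1",
}
```

In the setting of the abstract, `coarse` would be the 8-class model and `fine_by_group` would hold one model per coarse class, together covering the 26 fine-grained classes. Routing through the coarse stage means each fine-grained model only has to separate the dialects inside its own group.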