Abstract
This paper presents the systems submitted by the MAZA team to the Arabic Dialect Identification (ADI) shared task at the VarDial Evaluation Campaign 2017. The goal of the task is to evaluate computational models that identify the dialect of Arabic utterances using both audio and text transcriptions. The ADI shared task dataset included Modern Standard Arabic (MSA) and four Arabic dialects: Egyptian, Gulf, Levantine, and North-African. The three systems submitted by MAZA are based on combinations of multiple machine learning classifiers arranged as (1) a voting ensemble, (2) a mean probability ensemble, and (3) a meta-classifier. The best results were obtained by the meta-classifier, which achieved 71.7% accuracy and ranked second among the six teams that participated in the ADI shared task.
- Anthology ID:
- W17-1222
- Volume:
- Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Venue:
- VarDial
- Publisher:
- Association for Computational Linguistics
- Pages:
- 178–183
- URL:
- https://aclanthology.org/W17-1222
- DOI:
- 10.18653/v1/W17-1222
- Cite (ACL):
- Shervin Malmasi and Marcos Zampieri. 2017. Arabic Dialect Identification Using iVectors and ASR Transcripts. In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial), pages 178–183, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Arabic Dialect Identification Using iVectors and ASR Transcripts (Malmasi & Zampieri, VarDial 2017)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/W17-1222.pdf
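
The first two combination schemes named in the abstract, the voting ensemble and the mean probability ensemble, can be illustrated with a minimal sketch. This is not the authors' implementation: the label codes, function names, tie-breaking rule, and toy probabilities below are all assumptions made for illustration. The meta-classifier is not shown, since it would require training a second-level model on the base classifiers' outputs.

```python
from collections import Counter

# Hypothetical label codes for the five classes named in the abstract.
LABELS = ["EGY", "GLF", "LAV", "MSA", "NOR"]


def majority_vote(predictions):
    """Return the label chosen by the most base classifiers.

    Ties are broken in favour of the label that appears first in
    `predictions` (an arbitrary choice made for this sketch).
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def mean_probability(prob_rows, labels=LABELS):
    """Average per-class probabilities over classifiers; return the top label.

    `prob_rows` holds one probability vector per base classifier,
    ordered like `labels`.
    """
    n = len(prob_rows)
    means = [sum(row[i] for row in prob_rows) / n for i in range(len(labels))]
    return labels[means.index(max(means))]


# Toy outputs from three hypothetical base classifiers for one utterance.
votes = ["EGY", "GLF", "EGY"]
probs = [
    [0.50, 0.30, 0.10, 0.05, 0.05],
    [0.20, 0.60, 0.10, 0.05, 0.05],
    [0.45, 0.35, 0.10, 0.05, 0.05],
]

voted = majority_vote(votes)
averaged = mean_probability(probs)
print(voted, averaged)
```

On this toy input the two schemes disagree: two of the three classifiers vote for the first label, but the one confident dissenting classifier pulls the averaged probabilities toward the second label. That sensitivity to classifier confidence is the practical difference between hard voting and probability averaging.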