Abstract
We present our deep learning system submitted to MADAR Shared Task 2, which focuses on Twitter user dialect identification. We develop tweet-level identification models based on GRUs and BERT in supervised and semi-supervised settings. We then introduce a simple yet effective method of porting tweet-level labels to the level of users. Our system ranks first in the competition, with a macro F1 score of 71.70% and an accuracy of 77.40%.
 - Anthology ID:
 - W19-4637
 - Volume:
 - Proceedings of the Fourth Arabic Natural Language Processing Workshop
 - Month:
 - August
 - Year:
 - 2019
 - Address:
 - Florence, Italy
 - Editors:
 - Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
 - Venue:
 - WANLP
 - Publisher:
 - Association for Computational Linguistics
 - Pages:
 - 279–284
 - URL:
 - https://aclanthology.org/W19-4637
 - DOI:
 - 10.18653/v1/W19-4637
 - Cite (ACL):
 - Chiyu Zhang and Muhammad Abdul-Mageed. 2019. No Army, No Navy: BERT Semi-Supervised Learning of Arabic Dialects. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 279–284, Florence, Italy. Association for Computational Linguistics.
 - Cite (Informal):
 - No Army, No Navy: BERT Semi-Supervised Learning of Arabic Dialects (Zhang & Abdul-Mageed, WANLP 2019)
 - PDF:
 - https://preview.aclanthology.org/ingest-acl-2023-videos/W19-4637.pdf