Ahmad Beltagy


2020

Arabic Dialect Identification Using BERT-Based Domain Adaptation
Ahmad Beltagy | Abdelrahman Abouelenin | Omar ElSherief
Proceedings of the Fifth Arabic Natural Language Processing Workshop

Arabic is one of the most important languages in the world, and its use is growing. With the rise of social media giants such as Twitter, spoken Arabic dialects have come into wider use. In this paper, we describe our simple approach to NADI Shared Task 1, which requires building a system that distinguishes among 21 Arabic dialects. We introduce a semi-supervised deep learning approach, together with pre-processing, with results reported on the NADI Shared Task 1 corpus. Our system ranks 4th in the NADI shared task competition, achieving a macro-averaged F1 score of 23.09% with a very simple yet efficient approach to distinguishing among 21 Arabic dialects in tweets.