No Army, No Navy: BERT Semi-Supervised Learning of Arabic Dialects

Chiyu Zhang; Muhammad Abdul-Mageed

doi:10.18653/v1/W19-4637

Abstract

We present our deep learning system submitted to MADAR shared task 2, which focuses on Twitter user dialect identification. We develop tweet-level identification models based on GRUs and BERT in supervised and semi-supervised settings. We then introduce a simple yet effective method for porting tweet-level labels to the level of users. Our system ranks first in the competition, with a macro F1 score of 71.70% and an accuracy of 77.40%.

Anthology ID: W19-4637
Volume: Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month: August
Year: 2019
Address: Florence, Italy
Venue: WANLP
Publisher: Association for Computational Linguistics
Pages: 279–284
URL: https://aclanthology.org/W19-4637
DOI: 10.18653/v1/W19-4637
Cite (ACL): Chiyu Zhang and Muhammad Abdul-Mageed. 2019. No Army, No Navy: BERT Semi-Supervised Learning of Arabic Dialects. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 279–284, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): No Army, No Navy: BERT Semi-Supervised Learning of Arabic Dialects (Zhang & Abdul-Mageed, WANLP 2019)
PDF: https://preview.aclanthology.org/ingestion-script-update/W19-4637.pdf
