Abstract
In this paper, we approach the OffenseEval 2020 shared task (Mubarak et al., 2020) using ULMFiT (Howard and Ruder, 2018). We start from a model pre-trained on Arabic Wikipedia (Khooli, 2019) and fine-tune it on the target data set, which is highly imbalanced. We train forward and backward models and ensemble their results. We report the confusion matrix, accuracy, precision, recall, and F1 on the development set, along with summarized results on the test set. Transfer learning with ULMFiT shows potential for Arabic text classification.
References
H. Mubarak, K. Darwish, W. Magdy, T. Elsayed, and H. Al-Khalifa. Overview of OSACT4 Arabic offensive language detection shared task. 4, 2020.
J. Howard and S. Ruder. Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146, 2018.
A. Khooli. Applied data science. https://github.com/abedkhooli/ds2, 2019.
- Anthology ID:
- 2020.osact-1.13
- Volume:
- Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Hend Al-Khalifa, Walid Magdy, Kareem Darwish, Tamer Elsayed, Hamdy Mubarak
- Venue:
- OSACT
- SIG:
- Publisher:
- European Language Resource Association
- Note:
- Pages:
- 82–85
- Language:
- English
- URL:
- https://aclanthology.org/2020.osact-1.13
- DOI:
- Cite (ACL):
- Mohamed Abdellatif and Ahmed Elgammal. 2020. Offensive language detection in Arabic using ULMFiT. In Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, pages 82–85, Marseille, France. European Language Resource Association.
- Cite (Informal):
- Offensive language detection in Arabic using ULMFiT (Abdellatif & Elgammal, OSACT 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/2020.osact-1.13.pdf
- Code:
- abedkhooli/ds2
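
The abstract mentions ensembling a forward and a backward model and reporting a confusion matrix, accuracy, precision, recall, and F1. The following is a minimal sketch of those two steps, not the authors' code. It assumes the ensemble averages the two models' predicted probabilities for the offensive class and applies a 0.5 threshold; the abstract does not specify either choice. The function names and the toy scores are hypothetical, and the metrics are computed for the positive (offensive) class only.

```python
from typing import Dict, List, Tuple


def ensemble_average(p_forward: List[float], p_backward: List[float]) -> List[float]:
    """Average per-example offensive-class probabilities (assumed ensembling scheme)."""
    if len(p_forward) != len(p_backward):
        raise ValueError("both models must score the same examples")
    return [(f + b) / 2.0 for f, b in zip(p_forward, p_backward)]


def to_labels(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Map probabilities to labels: 1 = offensive, 0 = not offensive."""
    return [1 if p >= threshold else 0 for p in probs]


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) with the offensive class as positive."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1; empty denominators yield 0.0."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical scores for six examples, two of them offensive (imbalanced).
    y_true = [1, 0, 1, 0, 0, 0]
    p_fwd = [0.9, 0.2, 0.6, 0.1, 0.4, 0.3]
    p_bwd = [0.7, 0.4, 0.2, 0.1, 0.8, 0.1]
    y_pred = to_labels(ensemble_average(p_fwd, p_bwd))
    counts = confusion_counts(y_true, y_pred)
    print("tp, fp, fn, tn =", counts)
    print(metrics(*counts))
```

On an imbalanced data set, accuracy alone can look high even when the minority class is poorly detected, which is one reason to read it alongside precision, recall, and F1.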