ArCorona: Analyzing Arabic Tweets in the Early Days of Coronavirus (COVID-19) Pandemic

Hamdy Mubarak, Sabit Hassan


Abstract
Over the past few months, huge numbers of tweets and discussions about Coronavirus (COVID-19) have circulated in the Arab region. It is important for policy makers and the wider public to identify the types of tweets being shared in order to better understand public behavior, topics of interest, requests from governments, sources of tweets, etc. It is also crucial to prevent the spread of rumors and misinformation about the virus or bad cures. To this end, we present the largest manually annotated dataset of Arabic tweets related to COVID-19. We describe the annotation guidelines, analyze our dataset, and build effective machine learning and transformer-based models for classification.
Anthology ID:
2021.louhi-1.1
Volume:
Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis
Month:
April
Year:
2021
Address:
online
Venue:
Louhi
Publisher:
Association for Computational Linguistics
Pages:
1–6
URL:
https://aclanthology.org/2021.louhi-1.1
Cite (ACL):
Hamdy Mubarak and Sabit Hassan. 2021. ArCorona: Analyzing Arabic Tweets in the Early Days of Coronavirus (COVID-19) Pandemic. In Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis, pages 1–6, online. Association for Computational Linguistics.
Cite (Informal):
ArCorona: Analyzing Arabic Tweets in the Early Days of Coronavirus (COVID-19) Pandemic (Mubarak & Hassan, Louhi 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.louhi-1.1.pdf