Abstract
This work introduced an effective approach, based on the AraBERT language model, for fighting the COVID-19 infodemic in Arabic tweets. The approach was arranged as a two-step pipeline. The first step applied a series of pre-processing procedures to transform Twitter jargon, including emojis and emoticons, into plain text. The second step fine-tuned a version of AraBERT that was pre-trained on plain text and used it to classify the tweets with respect to their labels. The use of language models pre-trained on plain text rather than on tweets was motivated by two points from the scientific literature: (1) pre-trained language models are widely available in many languages, which avoids time-consuming and resource-intensive training on tweets from scratch and lets the effort focus on fine-tuning; and (2) available plain-text corpora are larger than tweet-only ones, allowing for better performance.
- Anthology ID:
- 2021.nlp4if-1.13
- Volume:
- Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda
- Month:
- June
- Year:
- 2021
- Address:
- Online
- Editors:
- Anna Feldman, Giovanni Da San Martino, Chris Leberknight, Preslav Nakov
- Venue:
- NLP4IF
- Publisher:
- Association for Computational Linguistics
- Pages:
- 93–98
- URL:
- https://aclanthology.org/2021.nlp4if-1.13
- DOI:
- 10.18653/v1/2021.nlp4if-1.13
- Cite (ACL):
- Ahmad Hussein, Nada Ghneim, and Ammar Joukhadar. 2021. DamascusTeam at NLP4IF2021: Fighting the Arabic COVID-19 Infodemic on Twitter Using AraBERT. In Proceedings of the Fourth Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 93–98, Online. Association for Computational Linguistics.
- Cite (Informal):
- DamascusTeam at NLP4IF2021: Fighting the Arabic COVID-19 Infodemic on Twitter Using AraBERT (Hussein et al., NLP4IF 2021)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2021.nlp4if-1.13.pdf
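
As an illustration of the first pipeline step described in the abstract (turning emojis and emoticons into plain text before fine-tuning), the following is a minimal sketch using only the Python standard library. It is not the authors' implementation. The `EMOTICONS` table, the function names `emoji_to_text` and `normalize_tweet`, and the choice of lowercase English Unicode names as replacement text are all assumptions made for illustration; the paper may map symbols to Arabic words instead. The second step, fine-tuning AraBERT on the normalized text, is not shown.

```python
import re
import unicodedata

# Hypothetical emoticon table (an assumption; the paper's mapping is not given here).
EMOTICONS = {
    ":)": "smile",
    ":(": "sad",
    ":D": "laugh",
}

# Match longer emoticons first so overlapping patterns do not clash.
_EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True))
)


def emoji_to_text(ch: str) -> str:
    """Replace a symbol character with its lowercase Unicode name."""
    # Unicode category "So" (Symbol, other) covers most pictographic emojis.
    if unicodedata.category(ch) == "So":
        name = unicodedata.name(ch, "")
        return f" {name.lower()} " if name else " "
    return ch


def normalize_tweet(text: str) -> str:
    """Turn emoticons and emojis into plain words and collapse whitespace."""
    text = _EMOTICON_RE.sub(lambda m: f" {EMOTICONS[m.group(0)]} ", text)
    text = "".join(emoji_to_text(ch) for ch in text)
    return " ".join(text.split())
```

For example, `normalize_tweet("stay home :)")` yields `"stay home smile"`. Under the assumptions above, the normalized string would then be passed to the tokenizer of a plain-text pre-trained model.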