Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking
Rutvik Vijjali, Prathyush Potluri, Siddharth Kumar, Sundeep Teki
Abstract
The rapid advancement of technology in online communication via social media platforms has led to a prolific rise in the spread of misinformation and fake news. Fake news is especially rampant in the current COVID-19 pandemic, leading people to believe false and potentially harmful claims and stories. Detecting fake news quickly can alleviate the spread of panic, chaos and potential health hazards. We developed a two-stage automated pipeline for COVID-19 fake news detection using state-of-the-art machine learning models for natural language processing. The first model leverages a novel fact-checking algorithm that retrieves the most relevant facts concerning user queries about particular COVID-19 claims. The second model verifies the level of “truth” in the queried claim by computing the textual entailment between the claim and the true facts retrieved from a manually curated COVID-19 dataset. The dataset is based on a publicly available knowledge source consisting of more than 5000 COVID-19 false claims and verified explanations, a subset of which was internally annotated and cross-validated to train and evaluate our models. We evaluate a series of models, ranging from those based on classical text-based features to more contextual Transformer-based models, and observe that a model pipeline based on BERT and ALBERT for the two stages respectively yields the best results.
- Anthology ID:
- 2020.nlp4if-1.1
- Volume:
- Proceedings of the 3rd NLP4IF Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Venue:
- NLP4IF
- Publisher:
- International Committee on Computational Linguistics (ICCL)
- Pages:
- 1–10
- URL:
- https://aclanthology.org/2020.nlp4if-1.1
- Cite (ACL):
- Rutvik Vijjali, Prathyush Potluri, Siddharth Kumar, and Sundeep Teki. 2020. Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking. In Proceedings of the 3rd NLP4IF Workshop on NLP for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 1–10, Barcelona, Spain (Online). International Committee on Computational Linguistics (ICCL).
- Cite (Informal):
- Two Stage Transformer Model for COVID-19 Fake News Detection and Fact Checking (Vijjali et al., NLP4IF 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.nlp4if-1.1.pdf
- Code:
- rutvikvijjali/COVID-19-Claims-Dataset
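
The two-stage flow described in the abstract (retrieve the most relevant fact for a claim, then check whether that fact supports the claim) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: bag-of-words cosine similarity stands in for the BERT-based retrieval stage, a toy negation-polarity check stands in for the ALBERT-based entailment stage, and the `FACTS` list and all function names are hypothetical placeholders rather than material from the paper's dataset or code.

```python
import math
import re
from collections import Counter

# Illustrative placeholder facts; NOT taken from the paper's dataset.
FACTS = [
    "5G mobile networks do not spread COVID-19.",
    "Drinking hot water does not cure COVID-19.",
    "Washing hands with soap helps reduce the spread of COVID-19.",
]

# Tiny negation lexicon used only by the toy stage-2 stand-in.
NEGATIONS = {"not", "no", "never", "cannot"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(claim, facts, top_k=1):
    """Stage 1 stand-in: rank facts by bag-of-words similarity to the claim.

    The paper uses a Transformer model here; this is a simplification.
    """
    query = Counter(tokenize(claim))
    ranked = sorted(
        facts,
        key=lambda fact: cosine(query, Counter(tokenize(fact))),
        reverse=True,
    )
    return ranked[:top_k]


def entails(fact, claim):
    """Stage 2 stand-in: treat matching negation polarity as support.

    A real system would use a trained textual entailment model instead.
    """
    fact_negated = bool(NEGATIONS & set(tokenize(fact)))
    claim_negated = bool(NEGATIONS & set(tokenize(claim)))
    return fact_negated == claim_negated


def check_claim(claim, facts=FACTS):
    """Run both stages: retrieve the best-matching fact, then verify."""
    fact = retrieve(claim, facts, top_k=1)[0]
    return {"claim": claim, "fact": fact, "supported": entails(fact, claim)}


if __name__ == "__main__":
    for claim in [
        "5G networks spread COVID-19",
        "Washing hands with soap reduces the spread of COVID-19",
    ]:
        print(check_claim(claim))
```

Keeping the two stages as separate functions mirrors the pipeline structure in the abstract, so either stand-in could in principle be swapped for a trained model without changing `check_claim`.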