A First Attempt at Unreliable News Detection in Swedish
Ricardo Muñoz Sánchez, Eric Johansson, Shakila Tayefeh, Shreyash Kad
Abstract
Throughout the COVID-19 pandemic, a parallel infodemic has unfolded, with information spreading faster than the virus itself. During this time, every individual needs access to accurate news in order to take appropriate protective measures, regardless of their country of origin or the language they speak, as misinformation can cause significant harm not only to individuals but also to society. In this paper we train several machine learning models, ranging from traditional machine learning to deep learning, to determine whether a news article comes from a reliable or an unreliable source, using only the body of the article. For this classification task, we use a previously introduced corpus of Swedish news related to the COVID-19 pandemic. Because our dataset is both unbalanced and small, we use subsampling and easy data augmentation (EDA) to mitigate these issues. We find that, due to the small size of our dataset, traditional machine learning combined with data augmentation yields results that rival those of transformer models such as BERT.
- Anthology ID:
- 2022.restup-1.1
- Volume:
- Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Johanna Monti, Valerio Basile, Maria Pia Di Buono, Raffaele Manna, Antonio Pascucci, Sara Tonelli
- Venue:
- ResTUP
- Publisher:
- European Language Resources Association
- Pages:
- 1–7
- URL:
- https://aclanthology.org/2022.restup-1.1
- Cite (ACL):
- Ricardo Muñoz Sánchez, Eric Johansson, Shakila Tayefeh, and Shreyash Kad. 2022. A First Attempt at Unreliable News Detection in Swedish. In Proceedings of the Second International Workshop on Resources and Techniques for User Information in Abusive Language Analysis, pages 1–7, Marseille, France. European Language Resources Association.
- Cite (Informal):
- A First Attempt at Unreliable News Detection in Swedish (Muñoz Sánchez et al., ResTUP 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.restup-1.1.pdf