FACTOID: A New Dataset for Identifying Misinformation Spreaders and Political Bias
Flora Sakketou, Joan Plepi, Riccardo Cervero, Henri Jacques Geiss, Paolo Rosso, Lucie Flek
Abstract
Proactively identifying misinformation spreaders is an important step towards mitigating the impact of fake news on our society. In this paper, we introduce a new contemporary Reddit dataset for fake news spreader analysis, called FACTOID, monitoring political discussions on Reddit since the beginning of 2020. The dataset contains over 4K users with 3.4M Reddit posts, and includes, beyond the users’ binary labels, also their fine-grained credibility level (very low to very high) and their political bias strength (extreme right to extreme left). As far as we are aware, this is the first fake news spreader dataset that simultaneously captures both the long-term context of users’ historical posts and the interactions between them. To create the first benchmark on our data, we provide methods for identifying misinformation spreaders by utilizing the social connections between the users along with their psycho-linguistic features. We show that the users’ social interactions can, on their own, indicate misinformation spreading, while the psycho-linguistic features are mostly informative in non-neural classification settings. In a qualitative analysis we observe that detecting affective mental processes correlates negatively with right-biased users, and that the openness to experience factor is lower for those who spread fake news.
- Anthology ID:
- 2022.lrec-1.345
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 3231–3241
- URL:
- https://aclanthology.org/2022.lrec-1.345
- Cite (ACL):
- Flora Sakketou, Joan Plepi, Riccardo Cervero, Henri Jacques Geiss, Paolo Rosso, and Lucie Flek. 2022. FACTOID: A New Dataset for Identifying Misinformation Spreaders and Political Bias. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3231–3241, Marseille, France. European Language Resources Association.
- Cite (Informal):
- FACTOID: A New Dataset for Identifying Misinformation Spreaders and Political Bias (Sakketou et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.lrec-1.345.pdf
- Code:
- caisa-lab/factoid-dataset
- Data:
- RealNews