MIPD: Exploring Manipulation and Intention In a Novel Corpus of Polish Disinformation

Arkadiusz Modzelewski, Giovanni Da San Martino, Pavel Savov, Magdalena Anna Wilczyńska, Adam Wierzbicki


Abstract
This study presents a novel corpus of 15,356 Polish web articles, including articles identified as containing disinformation. Our dataset enables a multifaceted understanding of disinformation. We present a distinctive multilayered methodology for annotating disinformation in texts. What sets our corpus apart is its focus on uncovering hidden intent and manipulation in disinformative content. A team of experts annotated each article with multiple labels indicating both disinformation creators’ intents and the manipulation techniques employed. Additionally, we set new baselines for binary disinformation detection and for two multiclass, multilabel classification tasks: classification of manipulation techniques and classification of intention types.
Anthology ID:
2024.emnlp-main.1103
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
19769–19785
URL:
https://preview.aclanthology.org/ingest_wac_2008/2024.emnlp-main.1103/
DOI:
10.18653/v1/2024.emnlp-main.1103
Cite (ACL):
Arkadiusz Modzelewski, Giovanni Da San Martino, Pavel Savov, Magdalena Anna Wilczyńska, and Adam Wierzbicki. 2024. MIPD: Exploring Manipulation and Intention In a Novel Corpus of Polish Disinformation. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 19769–19785, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
MIPD: Exploring Manipulation and Intention In a Novel Corpus of Polish Disinformation (Modzelewski et al., EMNLP 2024)
PDF:
https://preview.aclanthology.org/ingest_wac_2008/2024.emnlp-main.1103.pdf
Software:
 2024.emnlp-main.1103.software.zip
Data:
 2024.emnlp-main.1103.data.zip