Kinga Skorupska
2026
DiNO: Disinformation Narrative Observer
Witold Sosnowski | Arkadiusz Modzelewski | Kinga Skorupska | Adam Wierzbicki
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Disinformation is an escalating global threat, making it essential to understand its content, dissemination, and evolution. To confront this challenge, researchers have begun grouping related false claims into broader disinformation narratives, which can be tracked across cultures, time periods, and media sources. Analyzing these narratives provides critical insights for developing more effective countermeasures. To this end, we introduce DiNO: Disinformation Narrative Observer, a novel method designed to extract disinformation narratives from news articles. We applied DiNO to news articles on the Ukraine War, COVID-19 and Migration, sourced from disinformation-prone outlets as well as a reputable source. We evaluated the narratives extracted by DiNO by measuring how well their topics and stances aligned with a recognized disinformation narratives dataset. DiNO outperforms competitive narrative mining approaches, including Relatio and CaNarEx, achieving a 41%–44% improvement in topical alignment and a 30%–41% improvement in stance alignment.
2025
DiNaM: Disinformation Narrative Mining with Large Language Models
Witold Sosnowski | Arkadiusz Modzelewski | Kinga Skorupska | Adam Wierzbicki
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Disinformation poses a significant threat to democratic societies, public health, and national security. To address this challenge, fact-checking experts analyze and track disinformation narratives. However, the process of manually identifying these narratives is highly time-consuming and resource-intensive. In this article, we introduce DiNaM, the first algorithm and structured framework specifically designed for mining disinformation narratives. DiNaM uses a multi-step approach to uncover disinformation narratives. It first leverages Large Language Models (LLMs) to detect false information, then applies clustering techniques to identify underlying disinformation narratives. We evaluated DiNaM’s performance using ground-truth disinformation narratives from the EUDisinfoTest dataset. The evaluation employed the Weighted Chamfer Distance (WCD), which measures the similarity between two sets of embeddings: the ground truth and the predicted disinformation narratives. DiNaM achieved a state-of-the-art WCD score of 0.73, outperforming general-purpose narrative mining methods by a notable margin of 16.4–24.7%. We are releasing DiNaM’s codebase and the dataset to the public.
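The abstract above describes the Weighted Chamfer Distance only as a similarity between two sets of embeddings and does not give its formula. The following is a minimal sketch of a generic, symmetric Chamfer-style similarity over cosine scores, not the paper's definition. The function name `chamfer_similarity` and the optional `truth_weights` and `pred_weights` parameters are assumptions made for illustration, and the equal averaging of the two directions is likewise a choice of this sketch.

```python
import numpy as np


def _normalize(vectors):
    """Scale each row to unit length so dot products equal cosine similarity."""
    x = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def chamfer_similarity(truth, pred, truth_weights=None, pred_weights=None):
    """Symmetric Chamfer-style similarity between two sets of embeddings.

    Hypothetical illustration: each embedding is matched to its most similar
    counterpart in the other set, and the best-match scores are averaged
    (optionally weighted) in both directions.
    """
    t = _normalize(truth)
    p = _normalize(pred)
    sim = t @ p.T  # cosine similarity matrix, shape (n_truth, n_pred)

    best_for_truth = sim.max(axis=1)  # how well each ground-truth item is covered
    best_for_pred = sim.max(axis=0)   # how well each prediction is supported

    truth_side = np.average(best_for_truth, weights=truth_weights)
    pred_side = np.average(best_for_pred, weights=pred_weights)
    return 0.5 * (truth_side + pred_side)


# Toy 2-dimensional "embeddings" standing in for narrative embeddings.
truth = [[1.0, 0.0], [0.0, 1.0]]
pred = [[1.0, 0.0]]
score = chamfer_similarity(truth, pred)
weighted_score = chamfer_similarity(truth, pred, truth_weights=[3.0, 1.0])
```

In this toy case the single prediction matches one of the two ground-truth vectors exactly and misses the other, so the unweighted score falls between 0 and 1; placing more weight on the matched ground-truth item raises it.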
2024
EU DisinfoTest: a Benchmark for Evaluating Language Models’ Ability to Detect Disinformation Narratives
Witold Sosnowski | Arkadiusz Modzelewski | Kinga Skorupska | Jahna Otterbacher | Adam Wierzbicki
Findings of the Association for Computational Linguistics: EMNLP 2024
As narratives shape public opinion and influence societal actions, distinguishing between truthful and misleading narratives has become a significant challenge. To address this, we introduce the EU DisinfoTest, a novel benchmark designed to evaluate the efficacy of Language Models in identifying disinformation narratives. Developed through a Human-in-the-Loop methodology and grounded in research from EU DisinfoLab, the EU DisinfoTest comprises more than 1,300 narratives. Our benchmark includes persuasive elements under Logos, Pathos, and Ethos rhetorical dimensions. We assessed state-of-the-art LLMs, including the newly released GPT-4o, on their capability to perform zero-shot classification of disinformation narratives versus credible narratives. Our findings reveal that LLMs tend to regard narratives with authoritative appeals as trustworthy, while those with emotional appeals are frequently incorrectly classified as disinformative. These findings highlight the challenges LLMs face in nuanced content interpretation and suggest the need for tailored adjustments in LLM training to better handle diverse narrative structures.