Katarina Laken

2026

The proliferation of conspiracy theories and hateful messages on social media poses significant challenges for content moderation and public discourse. Despite their societal impact, existing datasets for automated conspiracy detection remain limited in scope and language coverage. We present a multilingual dataset of conspiracy content on Telegram comprising 5750 messages across English, Dutch, Italian, Spanish, and Portuguese from 87 channels documented as disseminating conspiracist and extremist content. Domain experts annotated messages for conspiracist tone, population replacement conspiracy theories (PRCT), vaccine conspiracies, and hate speech. We report extensively on difficulties and caveats in creating and annotating this type of dataset. We establish classification baselines by evaluating six models in zero-shot fashion and fine-tuning three encoder models, achieving F1 scores up to 0.800 for conspiracist tone, 0.846 for PRCT, 0.843 for vaccine-related conspiracy theories, and 0.734 for hate speech. Inter-annotator agreement was moderate, consistent with the complexity documented in similar annotation tasks.

2025

Multilingual Analysis of Narrative Properties in Conspiracist vs Mainstream Telegram Channels
Katarina Laken | Matteo Melis | Sara Tonelli | Marcos Garcia
Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH)

Conspiracist narratives posit an omnipotent, evil group causing harm across domains. However, modern-day online conspiracism is often more erratic, consisting of loosely connected posts displaying a general anti-establishment attitude pervaded by negative emotions. We gather a dataset of 300 conspiracist and mainstream Telegram channels in Italian and English and use automatic entity extraction and emotion detection to compare structural characteristics of both types of channels. We create a co-occurrence network of entities to analyze how the different types of channels introduce and use them across posts and topics. We find that conspiracist channels are characterized by anger. Moreover, co-occurrence networks of entities appearing in conspiracist channels are denser. We theorize that this reflects a narrative structure where all actants are pushed into a single domain. Conspiracist channels disproportionately associate the most central group of entities with anger and fear. We do not find evidence that entities in conspiracist narratives occur across more topics. This could indicate an erratic type of online conspiracism where everything can be connected to everything and that is characterized by a high number of entities and high levels of anger.
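
As an illustration of what a denser entity co-occurrence network means, the following is a minimal sketch and not the authors' pipeline. It assumes that two entities are linked whenever they appear in the same post and uses the standard undirected graph density, 2E / (N(N-1)). The function names and the toy posts are hypothetical.

```python
from itertools import combinations


def cooccurrence_edges(posts):
    """Return the set of undirected entity pairs that share at least one post."""
    edges = set()
    for entities in posts:
        # Sort so each unordered pair is stored in one canonical orientation.
        for a, b in combinations(sorted(set(entities)), 2):
            edges.add((a, b))
    return edges


def density(posts):
    """Density of the co-occurrence graph: 2E / (N * (N - 1))."""
    nodes = {entity for entities in posts for entity in entities}
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(cooccurrence_edges(posts)) / (n * (n - 1))


# Toy data (hypothetical): each inner list holds the entities found in one post.
dense_posts = [["A", "B", "C"], ["A", "C", "D"], ["B", "D"]]
sparse_posts = [["A", "B"], ["C", "D"], ["E", "F"]]

print(density(dense_posts))
print(density(sparse_posts))
```

In the first toy channel every entity ends up linked to every other one, while in the second each entity has a single partner, so the first graph has the higher density.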

2024

Decoding Sentiments about Migration in Portuguese Political Manifestos (2011, 2015, 2019)
Erik Bran Marino | Renata Vieira | Jesus Manuel Benitez Baleato | Ana Sofia Ribeiro | Katarina Laken
Proceedings of the 16th International Conference on Computational Processing of Portuguese - Vol. 2

Fralak at SemEval-2024 Task 4: combining RNN-generated hierarchy paths with simple neural nets for hierarchical multilabel text classification in a multilingual zero-shot setting
Katarina Laken
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

This paper describes the submission of team fralak for subtask 1 of task 4 of the SemEval-2024 shared task: ‘Multilingual detection of persuasion techniques in memes’. The first subtask included only the textual content of the memes. We restructured the labels into strings that showed the full path through the hierarchy. The system includes an RNN module that is trained to generate these strings. This module was then incorporated in an ensemble model with two more models consisting of basic fully connected networks. Although our model did not perform particularly well in the English-only setting, we found that it generalized better to other languages in a zero-shot context than most other models. Some additional experiments were performed to explain this. Findings suggest that the RNN generating the restructured labels generalized well across languages, but preprocessing did not seem to play a role. We conclude by giving suggestions for future improvements of our core idea.
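
The label restructuring described above can be pictured with a minimal sketch. It is an assumption about the general idea, not the paper's implementation: the hierarchy, the label names, and the separator below are all hypothetical and are not the shared task's actual label set.

```python
# Hypothetical toy hierarchy: each label maps to its parent (None marks the root).
PARENT = {
    "root": None,
    "emotional": "root",
    "logical": "root",
    "fear_appeal": "emotional",
    "strawman": "logical",
}


def path_string(label, parent=PARENT, sep=" > "):
    """Rewrite a label as the full root-to-label path through the hierarchy."""
    path = []
    node = label
    while node is not None:
        path.append(node)
        node = parent[node]
    # The walk goes leaf-to-root, so reverse it before joining.
    return sep.join(reversed(path))


print(path_string("fear_appeal"))
print(path_string("logical"))
```

A sequence model trained to generate such strings emits the ancestors of a label before the label itself, which is one way a hierarchy can be exposed to a generator.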

Co-authors

Davide Bassi 1

Søren Kirkegaard Fomsgaard 1

Michele Joshua Maggini 1

Matteo Melis 1

Paloma Piot 1

Ana Sofia Ribeiro 1
