Francesco Pierri


2026

The interaction between fringe subcultures and mainstream online communities poses significant challenges for understanding discourse on social media.In this work, we investigate whether users active in conspiracy-focused communities exhibit detectable linguistic signatures when participating in general-interest spaces, such as news, humor, or hobbyist forums.We analyze a large-scale longitudinal dataset of over 500 million comments spanning 10 years of Reddit activity, examining the communication patterns of these users across diverse social contexts independent of the topics they discuss.We show that these users exhibit distinctive linguistic patterns that enable machine learning models to reliably distinguish them from the general population within individual communities (averaging 87% accuracy across more than 20 binary classification tasks).Crucially, no single aggregate model captures these patterns across communities, as community-specific models outperform global classifiers by up to 17 percentage points.This result suggests that while these users are distinct, their linguistic expression is dynamic and highly responsive to the social norms of the environment they inhabit. Our findings suggest the need for tailored interventions in online spaces, as linguistic signals associated with conspiracy and fringe subcultures vary across communities and cannot be effectively addressed by uniform detection or moderation strategies.
Large language models (LLMs) are increasingly deployed in user-facing applications, raising concerns that they may reflect and amplify social biases. We investigate social identity biases in Chinese LLMs using Mandarin-specific prompts across ten representative models. Our evaluation compares ingroup (“We”) and outgroup (“They”) framings across 240 social groups salient in the Chinese context, using a two-tiered measurement framework that assesses both sentiment and toxicity. The prompt design explicitly accounts for linguistic properties of Mandarin, including the distinction between the default plural pronoun 他们 and the explicitly feminine plural 她们, enabling a controlled comparison of social identity framing effects. Across models, we observe systematic ingroup–outgroup asymmetries, although their expression differs across measurement dimensions. In particular, instruction tuning often reduces sentiment asymmetries, while toxicity gaps remain more persistent. Moreover, the feminine-marked plural 她们 is associated with higher toxicity than the default plural in several models. Our study introduces a language-aware evaluation framework for Chinese LLMs and shows that (i) social identity biases previously documented in English also manifest in Chinese and that (ii) Mandarin-specific linguistic structure can reveal bias patterns that are not directly observable in English-only settings.

2025

This paper investigates the effectiveness of TikTok’s enforcement mechanisms for limiting the exposure of harmful content to youth accounts. We collect over 7000 videos, classify them as harmful vs not-harmful, and then simulate interactions using age-specific sockpuppet accounts through both passive and active engagement strategies. We also evaluate the performance of large language (LLMs) and vision-language models (VLMs) in detecting harmful content, identifying key challenges in precision and scalability. Preliminary results show minimal differences in content exposure between adult and youth accounts, raising concerns about the platform’s age-based moderation. These findings suggest that the platform needs to strengthen youth safety measures and improve transparency in content moderation.
Creating globally inclusive AI systems demands datasets reflecting diverse social norms. Iran, with its unique cultural blend, offers an ideal case study, with Farsi adding linguistic complexity. In this work, we introduce the Iranian Social Norms (ISN) dataset, a novel collection of 1,699 Iranian social norms, including environments, demographic features, and scope annotation, alongside English translations. Our evaluation of 6 Large Language Models (LLMs) in classifying Iranian social norms, using a variety of prompts, uncovered critical insights into the impact of geographic and linguistic context. Results revealed a substantial performance gap in LLMs’ comprehension of Iranian norms. Notably, while the geographic context in English prompts enhanced the performance, this effect was absent in Farsi, pointing to nuanced linguistic challenges. Particularly, performance was significantly worse for Iran-specific norms, emphasizing the importance of culturally tailored datasets. As the first Farsi dataset for social norm classification, ISN will facilitate crucial cross-cultural analyses, shedding light on how values differ across contexts and cultures.
TikTok has skyrocketed in popularity over recent years, especially among younger audiences. However, there are public concerns about the potential of this platform to promote and amplify harmful content. This study presents the first systematic analysis of conspiracy theories on TikTok. By leveraging the official TikTok Research API we collect a longitudinal dataset of 1.5M videos shared in the U.S. over three years. We estimate a lower bound on the prevalence of conspiratorial videos (up to 1000 new videos per month) and evaluate the effects of TikTok’s Creativity Program for monetization, observing an overall increase in video duration regardless of content. Lastly, we evaluate the capabilities of state-of-the-art open-weight Large Language Models to identify conspiracy theories from audio transcriptions of videos. While these models achieve high precision in detecting harmful content (up to 96%), their overall performance remains comparable to fine-tuned traditional models such as RoBERTa. Our findings suggest that Large Language Models can serve as an effective tool for supporting content moderation strategies aimed at reducing the spread of harmful content on TikTok.