Wenchao Dong

2026

I Am Not Them: Persistent Outgroup Bias in Large Language Models Arising from Social Identity Persona Setting
Wenchao Dong | Assem Zhunis | Dongyoung Jeong | Hyojin Chin | Jiyoung Han | Meeyoung Cha
Proceedings of the Fifteenth Language Resources and Evaluation Conference

This research examines how large language models internalize social identities assigned through targeted prompts. Guided by social identity theory, we investigate whether and how these identity assignments cause AI systems to differentiate between "we" (the ingroup) and "they" (the outgroup). We demonstrate that self-categorization of social identity leads to both ingroup favoritism and outgroup bias, with the latter manifesting as strongly as the former. This finding is significant given the fundamental role of outgroup bias in driving intergroup prejudice and discrimination as documented in social psychology. We further propose a strategic intervention to mitigate such bias by guiding language models to adopt the identity of the initially disfavored group. This method, validated across both political and gender domains, exposes a critical dual function of group alignment: adopting one social identity inherently alters the model’s stance toward outgroups, effectively neutralizing pre-existing biases. Our work shows that understanding human-like AI behaviors is a critical prerequisite to building more balanced and socially responsible technology.

2025

Humans have an inherent need for community belongingness. This paper investigates this fundamental social motivation by compiling a large collection of parallel datasets comprising over 7 million posts and comments from Reddit and 200,000 posts and comments from Dread, a dark web discussion forum, covering similar topics. Grounded in five theoretical aspects of the Sense of Community framework, our analysis indicates that users on Dread exhibit a stronger sense of community membership. Our data analysis reveals striking similarities in post content across both platforms, despite the dark web’s restricted accessibility. However, these communities differ significantly in community-level closeness, including member interactions and greeting patterns that influence user retention and dynamics. We publicly release the parallel community datasets for other researchers to examine key differences and explore potential directions for further study.

Venues

Findings (1)
LREC (1)