Geng Liu

2026

Probing Social Identity Bias in Chinese LLMs with Gendered Pronouns and Social Groups
Geng Liu | Li Feng | Junjie Mu | Mengxiao Zhu | Francesco Pierri
Findings of the Association for Computational Linguistics: ACL 2026

Large language models (LLMs) are increasingly deployed in user-facing applications, raising concerns that they may reflect and amplify social biases. We investigate social identity biases in Chinese LLMs using Mandarin-specific prompts across ten representative models. Our evaluation compares ingroup (“We”) and outgroup (“They”) framings across 240 social groups salient in the Chinese context, using a two-tiered measurement framework that assesses both sentiment and toxicity. The prompt design explicitly accounts for linguistic properties of Mandarin, including the distinction between the default plural pronoun 他们 and the explicitly feminine plural 她们, enabling a controlled comparison of social identity framing effects. Across models, we observe systematic ingroup–outgroup asymmetries, although their expression differs across measurement dimensions. In particular, instruction tuning often reduces sentiment asymmetries, while toxicity gaps remain more persistent. Moreover, the feminine-marked plural 她们 is associated with higher toxicity than the default plural in several models. Our study introduces a language-aware evaluation framework for Chinese LLMs and shows that (i) social identity biases previously documented in English also manifest in Chinese and that (ii) Mandarin-specific linguistic structure can reveal bias patterns that are not directly observable in English-only settings.

2025

Towards an Automated Framework to Audit Youth Safety on TikTok
Linda Xue | Francesco Corso | Nicolo Fontana | Geng Liu | Stefano Ceri | Francesco Pierri
Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)

This paper investigates the effectiveness of TikTok’s enforcement mechanisms for limiting the exposure of harmful content to youth accounts. We collect over 7,000 videos, classify them as harmful or not harmful, and then simulate interactions using age-specific sockpuppet accounts through both passive and active engagement strategies. We also evaluate the performance of large language models (LLMs) and vision-language models (VLMs) in detecting harmful content, identifying key challenges in precision and scalability. Preliminary results show minimal differences in content exposure between adult and youth accounts, raising concerns about the platform’s age-based moderation. These findings suggest that the platform needs to strengthen youth safety measures and improve transparency in content moderation.

Co-authors

Junjie Mu 1

Linda Xue 1

Mengxiao Zhu (朱孟笑) 1
