The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing

Xinwei Guo, Jiashi Gao, Junlei Zhou, Jiaxin Zhang, Guanhua Chen, Xiangyu Zhao, Quanying Liu, Haiyan Wu, Xin Yao, Xuetao Wei


Abstract
Large Language Models (LLMs) are increasingly integrated into our daily lives, raising significant ethical concerns, especially about perpetuating stereotypes. While group-specific debiasing methods have made progress, they often fail to address multiple biases simultaneously. In contrast, group-agnostic debiasing has the potential to mitigate a variety of biases at once, but remains underexplored. In this work, we investigate the role of neutral words (the group-agnostic component) in enhancing the group-agnostic debiasing process. We first reveal that neutral words are essential for preserving semantic modeling, and we propose 𝜖-DPCE, a method that incorporates a loss function based on neutral word semantics to alleviate the deterioration of the Language Modeling Score (LMS) during debiasing. Furthermore, by introducing the SCM-Projection method, we demonstrate that SCM-based debiasing eliminates stereotypes by indirectly disrupting the association between attribute words and neutral words in the Stereotype Content Model (SCM) space. Our experiments show that neutral words, which often embed multi-group stereotypical objects, play a key role in the group-agnostic nature of SCM-based debiasing.
Anthology ID:
2025.findings-acl.1044
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venues:
Findings | WS
Publisher:
Association for Computational Linguistics
Pages:
20360–20371
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1044/
Cite (ACL):
Xinwei Guo, Jiashi Gao, Junlei Zhou, Jiaxin Zhang, Guanhua Chen, Xiangyu Zhao, Quanying Liu, Haiyan Wu, Xin Yao, and Xuetao Wei. 2025. The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing. In Findings of the Association for Computational Linguistics: ACL 2025, pages 20360–20371, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing (Guo et al., Findings 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.findings-acl.1044.pdf