@inproceedings{zhao-etal-2025-unmasking,
  title     = {Unmasking Style Sensitivity: A Causal Analysis of Bias Evaluation Instability in Large Language Models},
  author    = {Zhao, Jiaxu and
               Fang, Meng and
               Zhang, Kun and
               Pechenizkiy, Mykola},
  editor    = {Che, Wanxiang and
               Nabende, Joyce and
               Shutova, Ekaterina and
               Pilehvar, Mohammad Taher},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  month     = jul,
  year      = {2025},
  address   = {Vienna, Austria},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.acl-long.796/},
  pages     = {16314--16338},
  isbn      = {979-8-89176-251-0},
  abstract  = {Natural language processing applications are increasingly prevalent, but social biases in their outputs remain a critical challenge. While various bias evaluation methods have been proposed, these assessments show unexpected instability when input texts undergo minor stylistic changes. This paper conducts a comprehensive analysis of how different style transformations impact bias evaluation results across multiple language models and bias types using causal inference techniques. Our findings reveal that formality transformations significantly affect bias scores, with informal style showing substantial bias reductions (up to 8.33{\%} in LLaMA-2-13B). We identify appearance bias, sexual orientation bias, and religious bias as most susceptible to style changes, with variations exceeding 20{\%}. Larger models demonstrate greater sensitivity to stylistic variations, with bias measurements fluctuating up to 3.1{\%} more than in smaller models. These results highlight critical limitations in current bias evaluation methods and emphasize the need for reliable and fair assessments of language models.},
}
Markdown (Informal)
[Unmasking Style Sensitivity: A Causal Analysis of Bias Evaluation Instability in Large Language Models](https://aclanthology.org/2025.acl-long.796/) (Zhao et al., ACL 2025)
ACL