Yuxing Chen
2026
Multi-Persona Thinking for Bias Mitigation in Large Language Models
Yuxing Chen | Guoqing Luo | Zijun Wu | Lili Mou
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Models (LLMs) exhibit social biases, which can lead to harmful stereotypes and unfair outcomes. We propose Multi-Persona Thinking (MPT), a simple inference-time framework that reduces social bias by encouraging reasoning from multiple perspectives. MPT guides the model to consider contrasting social identities, such as male and female, together with a neutral viewpoint. These viewpoints then interact through an iterative reasoning process to identify and correct biased judgments. This design transforms the potential weakness of persona assignment into a mechanism to mitigate bias. We evaluate MPT on two widely used bias benchmarks with both open-source and closed-source models. Our results show that MPT achieves lower bias than existing prompting-based methods while maintaining core reasoning ability.
2020
Harnessing the linguistic signal to predict scalar inferences
Sebastian Schuster | Yuxing Chen | Judith Degen
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Pragmatic inferences often subtly depend on the presence or absence of linguistic features. For example, the presence of a partitive construction (of the) increases the strength of a so-called scalar inference: listeners perceive the inference that Chris did not eat all of the cookies to be stronger after hearing “Chris ate some of the cookies” than after hearing the same utterance without a partitive, “Chris ate some cookies”. In this work, we explore to what extent neural network sentence encoders can learn to predict the strength of scalar inferences. We first show that an LSTM-based sentence encoder trained on an English dataset of human inference strength ratings is able to predict ratings with high accuracy (r = 0.78). We then probe the model’s behavior using manually constructed minimal sentence pairs and corpus data. We find that the model inferred previously established associations between linguistic features and inference strength, suggesting that the model learns to use linguistic features to predict pragmatic inferences.