RoleCDE: Benchmarking and Mitigating Role–Alignment Trade-offs in Role-Playing Agents

Huayi Lai; Shichao Song; Simin Niu; Hanyu Wang; Jiawei Yang; Zhouxing Wang; Zhiqiang Yin; Xun Liang

Abstract

Role-playing agents (RPAs) are widely used to steer large language models (LLMs) toward role-consistent behavior, yet existing benchmarks mainly evaluate surface-level fidelity and offer limited insight into decision making under role–alignment value conflicts. To address this gap, we introduce RoleCDE, the first benchmark designed to evaluate RPAs under structured conflicts between role-specific values and alignment-oriented constraints. RoleCDE formulates role-aware decision making as cognitive dilemma scenarios, jointly evaluating role–scenario grounding, value conflict resolution, and decision tendencies. The benchmark is constructed at scale, covering approximately 8k diverse role profiles and scenarios and nearly 240k dilemma instances across three difficulty levels and eight role categories. Evaluation of several mainstream LLMs reveals a "Role Value Decoupling" phenomenon, where agents systematically default to alignment- and morality-consistent decisions rather than role-specific values when the two conflict, even under explicit role conditioning. This behavior is largely invariant to dilemma difficulty but varies substantially across role categories. We further show that RoleCDE-based fine-tuning effectively mitigates this decoupling by improving value trade-off reasoning, while preserving general role-playing fidelity and general reasoning performance. Code is available at: https://github.com/rabbitrose/RoleCDE.

Anthology ID: 2026.findings-acl.106
Volume: Findings of the Association for Computational Linguistics: ACL 2026
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2226–2248
URL: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.106/
Cite (ACL): Huayi Lai, Shichao Song, Simin Niu, Hanyu Wang, Jiawei Yang, Zhouxing Wang, Zhiqiang Yin, and Xun Liang. 2026. RoleCDE: Benchmarking and Mitigating Role–Alignment Trade-offs in Role-Playing Agents. In Findings of the Association for Computational Linguistics: ACL 2026, pages 2226–2248, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): RoleCDE: Benchmarking and Mitigating Role–Alignment Trade-offs in Role-Playing Agents (Lai et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.findings-acl.106.pdf
Checklist: 2026.findings-acl.106.checklist.pdf
