Wenlong Shi
2026
PersonaArena: Dynamic Simulation for Evaluating and Enhancing Persona-Level Role-Playing in Large Language Models
Wenlong Shi | Jianxun Lian | Mingqi Wu | Haiming Qin | Mingyang Zhou | Xing Xie | Naipeng Chao | Hao Liao
Findings of the Association for Computational Linguistics: ACL 2026
Large language models (LLMs) increasingly serve as interactive social agents, yet their ability to maintain coherent and authentic persona-level role-playing remains limited, particularly in realistic social scenarios. Existing research predominantly focuses on character-level settings and relies on static evaluation formats, failing to capture the complexity of everyday social interactions. In this work, we present PersonaArena, a dynamic simulation framework for evaluating and improving persona-level role-playing in LLMs. PersonaArena leverages a large, filtered corpus of user-generated social content to construct a nuanced persona bank, and elicits multi-turn, context-rich interactions within simulated social environments. Our framework features a multi-agent debating judge for holistic and unbiased assessment. Through extensive experiments, we demonstrate that PersonaArena enables rigorous evaluation and enhancement of LLMs’ role-playing capabilities, advancing the development of more authentic and socially adept AI agents. Our code and extended appendix are available at https://anonymous.4open.science/r/PersonaArena-B323/.