Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents

Rui Xu; Mingyu Wang; Xintao Wang; Dakuan Lu; Xiaoyu Tan; Wei Chu; Xu Yinghui

doi:10.18653/v1/2025.findings-emnlp.819

Abstract

Recent advances in Large Language Model (LLM)-based Role-Playing Language Agents (RPLAs) have attracted broad attention in various applications. While chain-of-thought reasoning has shown importance in many tasks for LLMs, the internal thinking processes of RPLAs remain unexplored. Understanding characters’ inner thoughts is crucial for developing advanced RPLAs. In this paper, we introduce ROLETHINK, a novel benchmark constructed from literature for evaluating character thought generation. We propose the task of inner thought reasoning, constructing 6,058 data entries from 76 books, which includes two sets: the gold set that compares generated thoughts with original character monologues, and the silver set that uses expert-synthesized character analyses as references. To address this challenge, we propose MIRROR, a chain-of-thought approach that generates character thoughts by retrieving memories, predicting character reactions, and synthesizing motivations. Through extensive experiments, we demonstrate the importance of inner thought reasoning for RPLAs, and MIRROR consistently outperforms existing methods.

Anthology ID: 2025.findings-emnlp.819
Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 15148–15168
URL: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.819/
DOI: 10.18653/v1/2025.findings-emnlp.819
Cite (ACL): Rui Xu, Mingyu Wang, Xintao Wang, Dakuan Lu, Xiaoyu Tan, Wei Chu, and Xu Yinghui. 2025. Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 15148–15168, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents (Xu et al., Findings 2025)
PDF: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.819.pdf
Checklist: 2025.findings-emnlp.819.checklist.pdf