Revealing and Mitigating the Challenge of Detecting Character Knowledge Errors in LLM Role-Playing

Wenyuan Zhang; Shuaiyi Nie; Jiawei Sheng; Zefeng Zhang; Xinghua Zhang; Yongquan He; Tingwen Liu (柳厅文)

Abstract

Large language model (LLM) role-playing has gained widespread attention, and authentic character knowledge is crucial for constructing realistic LLM role-playing agents. However, existing work has largely overlooked LLMs’ ability to detect characters’ known knowledge errors (KKE) and unknown knowledge errors (UKE) while playing roles, a gap that can lead to low-quality automatic construction of trainable character corpora. In this paper, we propose RoleKE-Bench to evaluate LLMs’ ability to detect KKE and UKE. The results indicate that even the latest LLMs struggle to detect these two types of errors effectively, especially when it comes to familiar knowledge. We experiment with various reasoning strategies and propose an agent-based reasoning method, Self-Recollection and Self-Doubt (S²RD), to further explore the potential for improving error detection capabilities.

Anthology ID: 2025.emnlp-main.1689
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 33267–33290
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1689/
Cite (ACL): Wenyuan Zhang, Shuaiyi Nie, Jiawei Sheng, Zefeng Zhang, Xinghua Zhang, Yongquan He, and Tingwen Liu. 2025. Revealing and Mitigating the Challenge of Detecting Character Knowledge Errors in LLM Role-Playing. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 33267–33290, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Revealing and Mitigating the Challenge of Detecting Character Knowledge Errors in LLM Role-Playing (Zhang et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1689.pdf
Checklist: 2025.emnlp-main.1689.checklist.pdf
