I Know, but I Don’t Know! How Persona Conflict Undermines Instruction Adherence in Large Language Models

Seonmin Koo; Jinsung Kim; Heui-Seok Lim

Abstract

Large Language Models (LLMs) are expected to generate appropriate responses while adhering to predefined prior constraints or knowledge, such as user personas, across various dialogue scenarios. However, real-world interactions frequently involve semantic conflicts between such prior information and actual user-provided inputs. Despite this, prior studies on persona-grounded dialogue—one of the representative tasks in personal preference modeling—have predominantly assumed idealized scenarios where persona and user utterances are fully aligned. To bridge this gap, we introduce and formalize the notion of persona conflict, wherein predefined personas contradict the personal information expressed by the user during interaction. We present a systematic verification framework to examine model behavior under such conflict scenarios. In detail, we propose a taxonomy that categorizes model behaviors into three distinct response types (adhering, sycophantic, and wavering) and develop a measurement schema grounded in this taxonomy. Our study provides a comprehensive analysis of the persona conflict phenomenon, identifying diverse key behavioral factors. Extensive experiments and in-depth analysis provide new insights into designing robust dialogue models capable of managing persona inconsistencies.

Anthology ID: 2026.findings-eacl.24
Volume: Findings of the Association for Computational Linguistics: EACL 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 465–489
URL: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.24/
Cite (ACL): Seonmin Koo, Jinsung Kim, and Heuiseok Lim. 2026. I Know, but I Don’t Know! How Persona Conflict Undermines Instruction Adherence in Large Language Models. In Findings of the Association for Computational Linguistics: EACL 2026, pages 465–489, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): I Know, but I Don’t Know! How Persona Conflict Undermines Instruction Adherence in Large Language Models (Koo et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.24.pdf
Checklist: 2026.findings-eacl.24.checklist.pdf
