@inproceedings{hong-etal-2025-measuring,
  title     = {Measuring Sycophancy of Language Models in Multi-turn Dialogues},
  author    = {Hong, Jiseung and
               Byun, Grace and
               Kim, Seungone and
               Shu, Kai},
  editor    = {Christodoulopoulos, Christos and
               Chakraborty, Tanmoy and
               Rose, Carolyn and
               Peng, Violet},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2025},
  month     = nov,
  year      = {2025},
  address   = {Suzhou, China},
  publisher = {Association for Computational Linguistics},
  url       = {https://aclanthology.org/2025.findings-emnlp.121/},
  doi       = {10.18653/v1/2025.findings-emnlp.121},
  pages     = {2239--2259},
  isbn      = {979-8-89176-335-7},
  abstract  = {Large Language Models (LLMs) are expected to provide helpful and harmless responses, yet they often exhibit \textit{sycophancy}{---}conforming to user beliefs regardless of factual accuracy or ethical soundness. Prior research on sycophancy has primarily focused on single-turn factual correctness, overlooking the dynamics of real-world interactions. In this work, we introduce \textbf{SYCON Bench} (\textbf{SY}cophantic \textbf{CON}formity benchmark), a novel evaluation suite that assesses sycophantic behavior in multi-turn, free-form conversational settings. Our benchmark measures how quickly a model conforms to the user (\textit{Turn of Flip}) and how frequently it shifts its stance under sustained user pressure (\textit{Number of Flip}). Applying SYCON Bench to 17 LLMs across three real-world scenarios, we find that sycophancy remains a prevalent failure mode. Our analysis shows that alignment tuning amplifies sycophantic behavior, whereas model scaling and reasoning optimization strengthen the model{'}s ability to resist undesirable user views. Reasoning models generally outperform instruction-tuned models but often fail when they over-index on logical exposition instead of directly addressing the user{'}s underlying beliefs. Finally, we evaluate four additional prompting strategies and demonstrate that adopting a third-person perspective reduces sycophancy by up to 63.8{\%} in debate scenario.},
}
% NOTE(review): the lines below are citation-widget text pasted in from the ACL
% Anthology page, not BibTeX. They sat fused to the entry's closing brace;
% preserved here outside any entry (BibTeX ignores text between entries), with
% the canonical URL substituted for the preview mirror.
% Markdown (Informal)
% [Measuring Sycophancy of Language Models in Multi-turn Dialogues](https://aclanthology.org/2025.findings-emnlp.121/) (Hong et al., Findings 2025)
% ACL