Challenging the Evaluator: LLM Sycophancy Under User Rebuttal

Sung Won Kim; Daniel Khashabi

doi:10.18653/v1/2025.findings-emnlp.1222

Challenging the Evaluator: LLM Sycophancy Under User Rebuttal

Abstract

Large Language Models (LLMs) often exhibit sycophancy, distorting responses to align with user beliefs, notably by readily agreeing with user counterarguments. Paradoxically, LLMs are increasingly adopted as successful evaluative agents for tasks such as grading and adjudicating claims. This research investigates that tension: why do LLMs show sycophancy when challenged in subsequent conversational turns, yet perform well when evaluating conflicting arguments presented simultaneously? We empirically tested these contrasting scenarios by varying key interaction patterns. We find that state-of-the-art models: (1) are more likely to endorse a user’s counterargument when framed as a follow-up from a user, rather than when both responses are presented simultaneously for evaluation; (2) show increased susceptibility to persuasion when the user’s rebuttal includes detailed reasoning, even when the conclusion of the reasoning is incorrect; and (3) are more readily swayed by casually phrased feedback than by formal critiques, even when the casual input lacks justification. Our results highlight the risk of relying on LLMs for judgment tasks without accounting for conversational framing.

Anthology ID:: 2025.findings-emnlp.1222
Volume:: Findings of the Association for Computational Linguistics: EMNLP 2025
Month:: November
Year:: 2025
Address:: Suzhou, China
Editors:: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:: Findings
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 22461–22478
Language:
URL:: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1222/
DOI:: 10.18653/v1/2025.findings-emnlp.1222
Bibkey:
Cite (ACL):: Sung Won Kim and Daniel Khashabi. 2025. Challenging the Evaluator: LLM Sycophancy Under User Rebuttal. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 22461–22478, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):: Challenging the Evaluator: LLM Sycophancy Under User Rebuttal (Kim & Khashabi, Findings 2025)
Copy Citation:
PDF:: https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1222.pdf
Checklist:: 2025.findings-emnlp.1222.checklist.pdf

PDF Cite Search Checklist Fix data