Avneet Kaur


2025

Echoes of Agreement: Argument Driven Sycophancy in Large Language Models
Avneet Kaur
Findings of the Association for Computational Linguistics: EMNLP 2025

Existing evaluations of political biases in Large Language Models (LLMs) highlight their high sensitivity to prompt formulation. Furthermore, LLMs are known to exhibit sycophancy, a tendency to align their outputs with a user's stated belief, which is often attributed to human feedback during fine-tuning. However, such bias in the presence of explicit argumentation within a prompt remains underexplored. This paper investigates how argumentative prompts induce sycophantic behaviour in LLMs in a political context. Through a series of experiments, we demonstrate that models consistently alter their responses to mirror the stance expressed by the user. This sycophantic behaviour is observed in both single-turn and multi-turn interactions, and its intensity correlates with argument strength. Our findings establish a link between user stance and model sycophancy, revealing a critical vulnerability that impacts model reliability. This has significant implications for models deployed in real-world settings and calls for the development of robust evaluations and mitigations against manipulative or biased interactions.