@inproceedings{du-etal-2026-say,
title = "It{'}s Not What You Say, It{'}s How You Say It: Evaluating {LLM} Responses to Expressions of Belief",
author = "Du, Kevin and
K{\"u}mpel, Clara and
Wastl, Michelle and
Warstadt, Alex",
editor = "Liakata, Maria and
Moreira, Viviane P. and
Zhang, Jiajun and
Jurgens, David",
booktitle = "Proceedings of the 64th Annual Meeting of the {A}ssociation for {C}omputational {L}inguistics (Volume 1: Long Papers)",
month = jul,
year = "2026",
address = "San Diego, California, United States",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingest-acl/2026.acl-long.142/",
pages = "3137--3151",
ISBN = "979-8-89176-390-6",
abstract = "Users frequently express their beliefs to large language models (LLMs). In some situations, the LLM should accept these contextual beliefs as true. In others, it should stick to its prior knowledge. Notably, users' expressions of belief (EoBs) can take linguistically diverse forms{---}using presuppositions, evidential and certainty markers, or varied tones{---}each of which may persuade LLMs to a different degree. We introduce a typology to systematically evaluate how different EoBs affect whether models follow context versus prior knowledge. The typology is grounded in four linguistically motivated dimensions: form, evidentiality, epistemic stance, and tone, spanning 17 fine-grained types. By pairing these EoBs with world knowledge facts, we generate controlled EoB{--}query pairs that isolate the effect of linguistic variation. Using this benchmark, we evaluate 16 LLMs that differ in architecture (Llama3, Qwen3, Gemma3), scale (1B--30B parameters), and training stages (base vs. instruct). We identify meaningful variations in response behavior across these axes, e.g., that bigger models and instruction models tend to be less context-following than smaller models and base models. We further identify specific EoBs that statistically significantly persuade LLMs more consistently than others. Our work reveals systematic patterns in how linguistic framing affects LLM context integration, with implications for prompt engineering and model robustness."
}