What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance

William Watson; Nicole Cho; Sumitra Ganesh; Manuela Veloso

Abstract

Large Language Model (LLM) hallucinations are usually treated as defects of the model or its decoding strategy. Drawing on classical linguistics, we argue that a query’s form can also shape a listener’s (and model’s) response. We operationalize this insight by constructing a 22-dimension query feature vector covering clause complexity, lexical rarity, anaphora, negation, answerability, and intention grounding, all known to affect human comprehension. Using 369,837 real-world queries, we ask: Are there certain types of queries that make hallucination more likely? A large-scale analysis reveals a consistent "risk landscape": certain features, such as deep clause nesting and underspecification, align with higher hallucination propensity. In contrast, clear intention grounding and answerability align with lower hallucination rates. Other features, including domain specificity, show mixed, dataset- and model-dependent effects. These findings establish an empirically observable query-feature representation correlated with hallucination risk, paving the way for guided query rewriting and future intervention studies.

Anthology ID: 2026.findings-eacl.251
Volume: Findings of the Association for Computational Linguistics: EACL 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4794–4827
URL: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.251/
Cite (ACL): William Watson, Nicole Cho, Sumitra Ganesh, and Manuela Veloso. 2026. What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance. In Findings of the Association for Computational Linguistics: EACL 2026, pages 4794–4827, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance (Watson et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.251.pdf
Checklist: 2026.findings-eacl.251.checklist.pdf