Born Pragmatic, Trained to Hallucinate? Quantifying the Origins of Contextual Bias in LLMs via the PaCE Benchmark

Ziming Li, Yu Tian, Tian Lan, Jiang Li, Zehua Duo, Guanglai Gao, Xiangdong Su


Abstract
While Large Language Models (LLMs) excel at capturing communicative intent, this capability introduces a side effect: Pragmatic Hallucination, where models over-interpret literal contexts to generate non-factual inferences. To quantify this, we introduce the PaCE (Pragmatics-as-Context Evaluation) benchmark, comprising over 3,000 manually verified "context-flip" samples. Evaluations across nine mainstream models reveal a significant Context Sensitivity Gap (CSG), with literal accuracy consistently lagging behind pragmatic reasoning. Attribution analysis indicates that Reinforcement Learning from Human Feedback (RLHF) exacerbates this bias, and neither parameter scaling nor Chain-of-Thought (CoT) fully mitigates it. Crucially, "Strict Prompting" effectively reverses the CSG, demonstrating that the phenomenon stems from behavioral lock-in during training rather than inherent capability deficiencies. Furthermore, error patterns exhibit high systematic correlation across diverse architectures. This study highlights that current alignment paradigms lack precise control over pragmatic boundaries, underscoring the necessity for a "Literal Grounding" mechanism in future safety frameworks.
Anthology ID:
2026.findings-acl.959
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
19211–19235
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.959/
Cite (ACL):
Ziming Li, Yu Tian, Tian Lan, Jiang Li, Zehua Duo, Guanglai Gao, and Xiangdong Su. 2026. Born Pragmatic, Trained to Hallucinate? Quantifying the Origins of Contextual Bias in LLMs via the PaCE Benchmark. In Findings of the Association for Computational Linguistics: ACL 2026, pages 19211–19235, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Born Pragmatic, Trained to Hallucinate? Quantifying the Origins of Contextual Bias in LLMs via the PaCE Benchmark (Li et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.959.pdf
Checklist:
2026.findings-acl.959.checklist.pdf