Harmful Factuality: LLMs Correcting What They Shouldn’t

Mingchen Li; Hanzhi Zhang; Heng Fan; Junhua Ding; Yunhe Feng

Abstract

While Large Language Models (LLMs) are trained for factual accuracy, this objective can directly conflict with the critical demand for source fidelity. This paper isolates and formalizes this conflict as Harmful Factuality Hallucination (HFH): a previously overlooked failure mode where an LLM’s attempt to “correct” perceived source errors results in an output that is factually true but unfaithful to the input. Unlike traditional hallucination research focused on models generating falsehoods, we investigate the harm of misplaced correctness. We introduce a reproducible framework to elicit and measure HFH using controlled entity-level perturbations (both soft, embedding-based and hard, instruction-based) paired with strategic entity selection. Across summarization, rephrasing, and QA tasks, our evaluation of diverse LLMs reveals that HFH is a prevalent behavior that worsens with model scale. We identify three underlying mechanisms and demonstrate that a simple instructional prompt can reduce HFH rates by approximately 50%. Our framework turns the abstract factuality–faithfulness tension into a measurable, actionable target for building more reliable LLM systems. Our code is publicly available at https://github.com/ResponsibleAILab/Harmful-Factuality-Hallucination.
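As a rough illustration of how an HFH rate could be computed from entity-level perturbations, the sketch below flags an output as HFH when it restores the original (true) entity and drops the perturbed entity that actually appeared in the input. This is a minimal sketch under stated assumptions, not the paper's implementation: the `Example` record, the `is_hfh` and `hfh_rate` helpers, and the case-insensitive substring matching rule are all hypothetical, and the released code may define and detect HFH differently.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One evaluation item (hypothetical schema, not taken from the paper)."""
    original_entity: str   # entity in the unperturbed source (factually true)
    perturbed_entity: str  # entity substituted into the source shown to the model
    output: str            # model output produced from the perturbed source


def is_hfh(ex: Example) -> bool:
    """Return True if the output restores the original entity and omits the
    perturbed one, i.e. it is factually true but unfaithful to the input.

    Assumption: plain case-insensitive substring matching. A real detector
    would likely need entity linking or alias handling.
    """
    out = ex.output.casefold()
    restored = ex.original_entity.casefold() in out
    kept_perturbed = ex.perturbed_entity.casefold() in out
    return restored and not kept_perturbed


def hfh_rate(examples: list[Example]) -> float:
    """Fraction of examples flagged as HFH (0.0 for an empty list)."""
    if not examples:
        return 0.0
    return sum(is_hfh(ex) for ex in examples) / len(examples)


if __name__ == "__main__":
    # Toy data: the perturbed source says the Eiffel Tower is in Berlin.
    examples = [
        Example("Paris", "Berlin", "The Eiffel Tower is in Paris."),   # "corrected"
        Example("Paris", "Berlin", "The Eiffel Tower is in Berlin."),  # faithful
    ]
    print(hfh_rate(examples))
```

Substring matching keeps the sketch short, but it would miss paraphrased or abbreviated entities, so any rate it yields is only a rough signal.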

Anthology ID: 2026.findings-eacl.46
Volume: Findings of the Association for Computational Linguistics: EACL 2026
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Vera Demberg, Kentaro Inui, Lluís Marquez
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 896–912
URL: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.46/
Cite (ACL): Mingchen Li, Hanzhi Zhang, Heng Fan, Junhua Ding, and Yunhe Feng. 2026. Harmful Factuality: LLMs Correcting What They Shouldn’t. In Findings of the Association for Computational Linguistics: EACL 2026, pages 896–912, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Harmful Factuality: LLMs Correcting What They Shouldn’t (Li et al., Findings 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.findings-eacl.46.pdf
Checklist: 2026.findings-eacl.46.checklist.pdf
