Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation
Yu Wang, Jiaxin Zhang, Xiang Gao, Wendi Cui, Peng Li, Kamalika Das
Abstract
In tasks such as summarization and open-book question answering (QA), Large Language Models (LLMs) frequently experience “contextual hallucination”, where they generate irrelevant or incorrect responses despite having access to accurate information in the input. This issue often stems from the models’ propensity to prioritize self-generated content over input context, leading to a disregard for pertinent details. To address this challenge, we introduce Guided Attention Map Editing (GAME), an approach that dynamically adjusts attention maps to enhance contextual relevance. During inference, GAME employs a trained classifier to identify attention maps likely to induce hallucinations and implements targeted interventions. These interventions, guided by gradient-informed “edit directions”, strategically redistribute attention weights across various heads to efficiently mitigate hallucination. Extensive evaluations on challenging summarization and open-book QA tasks demonstrate that GAME consistently and significantly reduces hallucinations across diverse open-source models, thereby improving the reliability and applicability of LLMs.
- Anthology ID:
- 2025.findings-naacl.458
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2025
- Month:
- April
- Year:
- 2025
- Address:
- Albuquerque, New Mexico
- Editors:
- Luis Chiruzzo, Alan Ritter, Lu Wang
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 8206–8217
- URL:
- https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.findings-naacl.458/
- Cite (ACL):
- Yu Wang, Jiaxin Zhang, Xiang Gao, Wendi Cui, Peng Li, and Kamalika Das. 2025. Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 8206–8217, Albuquerque, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation (Wang et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.findings-naacl.458.pdf