Attention Consistency for LLMs Explanation

Tian Lan, Jinyuan Xu, Xue He, Jenq-Neng Hwang, Lei Li


Abstract
Understanding the decision-making processes of large language models (LLMs) is essential for their trustworthy development and deployment; however, current interpretability methods often face challenges such as low resolution and high computational cost. To address these limitations, we propose the Multi-Layer Attention Consistency Score (MACS), a novel, lightweight, and easily deployable heuristic for estimating the importance of input tokens in decoder-based models. MACS measures the contribution of each input token based on the consistency of maximal attention across layers. Empirical evaluations demonstrate that MACS achieves a favorable trade-off between interpretability quality and computational efficiency, matching the faithfulness of more complex techniques while using 22% less VRAM and reducing latency by 30%.
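
The abstract describes MACS only at a high level (token importance from the consistency of maximal attention across layers). As a rough sketch of such a heuristic, not the authors' implementation, the snippet below scores each input token by how often it receives the maximal attention from a chosen query position across layers and heads. It assumes per-layer attention tensors of shape (batch, heads, seq, seq), as returned by a Hugging Face decoder run with output_attentions=True; the function name macs_scores and the exact aggregation are illustrative.

    import torch

    def macs_scores(attentions, query_pos=-1):
        """Illustrative multi-layer attention-consistency heuristic
        (a sketch in the spirit of MACS, not the authors' code).

        attentions: tuple of per-layer tensors, each of shape
            (batch, num_heads, seq_len, seq_len), e.g. the
            `attentions` field returned by a Hugging Face decoder
            run with output_attentions=True.
        Returns a (batch, seq_len) tensor giving, for each input
        token, the fraction of (layer, head) pairs in which it
        receives the maximal attention from `query_pos`.
        """
        attn = torch.stack(attentions)        # (layers, batch, heads, seq, seq)
        row = attn[..., query_pos, :]         # attention paid by the query position
        argmax_tok = row.argmax(dim=-1)       # (layers, batch, heads)
        layers, _, heads = argmax_tok.shape
        seq_len = row.shape[-1]
        # Count, per token, how often it is the argmax across layers and heads
        one_hot = torch.nn.functional.one_hot(argmax_tok, seq_len).float()
        return one_hot.sum(dim=(0, 2)) / (layers * heads)   # (batch, seq_len)

With a loaded model, scores could be obtained via out = model(input_ids, output_attentions=True) followed by macs_scores(out.attentions); under this sketch, higher values mark tokens the model attends to most consistently.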
Anthology ID:
2025.findings-emnlp.91
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rosé, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1736–1750
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.91/
DOI:
10.18653/v1/2025.findings-emnlp.91
Cite (ACL):
Tian Lan, Jinyuan Xu, Xue He, Jenq-Neng Hwang, and Lei Li. 2025. Attention Consistency for LLMs Explanation. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 1736–1750, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Attention Consistency for LLMs Explanation (Lan et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.91.pdf
Checklist:
2025.findings-emnlp.91.checklist.pdf