Hallucination Detection in LLMs Using Spectral Features of Attention Maps
Jakub Binkowski, Denis Janiak, Albert Sawczyn, Bogdan Gabrys, Tomasz Jan Kajdanowicz
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance across various tasks but remain prone to hallucinations. Detecting hallucinations is essential for safety-critical applications, and recent methods leverage attention map properties to this end, though their effectiveness remains limited. In this work, we investigate the spectral features of attention maps by interpreting them as adjacency matrices of graph structures. We propose the LapEigvals method, which utilises the top-k eigenvalues of the Laplacian matrix derived from the attention maps as input to hallucination detection probes. Empirical evaluations demonstrate that our approach achieves state-of-the-art hallucination detection performance among attention-based methods. Extensive ablation studies further highlight the robustness and generalisation of LapEigvals, paving the way for future advancements in the hallucination detection domain.
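The abstract describes treating an attention map as a weighted adjacency matrix, forming its graph Laplacian, and using the top-k eigenvalues as probe features. The following is a minimal, illustrative sketch of that idea, not the authors' implementation: the symmetrisation step, the unnormalised Laplacian, the function name `lap_eigvals_features`, and the choice of k are all assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code): top-k Laplacian eigenvalues
# from a single attention map, treated as a weighted graph over tokens.
import numpy as np

def lap_eigvals_features(attention: np.ndarray, k: int = 10) -> np.ndarray:
    """Return the k largest eigenvalues of the graph Laplacian built from
    one attention map of shape (seq_len, seq_len)."""
    # Symmetrise the attention weights to get an undirected adjacency matrix
    # (an assumption; attention maps are not symmetric by themselves).
    adj = 0.5 * (attention + attention.T)
    # Unnormalised graph Laplacian L = D - A.
    degrees = adj.sum(axis=1)
    laplacian = np.diag(degrees) - adj
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals = np.linalg.eigvalsh(laplacian)
    # Keep the k largest; zero-pad for sequences shorter than k tokens.
    top_k = eigvals[::-1][:k]
    if top_k.shape[0] < k:
        top_k = np.pad(top_k, (0, k - top_k.shape[0]))
    return top_k

# Usage: a random row-stochastic matrix stands in for one head's attention map.
attn = np.random.dirichlet(np.ones(16), size=16)  # rows sum to 1, like softmax attention
features = lap_eigvals_features(attn, k=10)
print(features.shape)  # (10,)
```

In practice such features would presumably be collected per head and layer and concatenated before being passed to a lightweight classifier (the "probe"); the exact aggregation used in the paper is not specified in this entry.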
- Anthology ID: 2025.emnlp-main.1239
- Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 24365–24396
- URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1239/
- Cite (ACL): Jakub Binkowski, Denis Janiak, Albert Sawczyn, Bogdan Gabrys, and Tomasz Jan Kajdanowicz. 2025. Hallucination Detection in LLMs Using Spectral Features of Attention Maps. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 24365–24396, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): Hallucination Detection in LLMs Using Spectral Features of Attention Maps (Binkowski et al., EMNLP 2025)
- PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1239.pdf