TPA: Next Token Probability Attribution for Detecting Hallucinations in RAG

Pengqian Lu; Jie Lu; Anjin Liu; Guangquan Zhang

Abstract

Detecting hallucinations in Retrieval-Augmented Generation (RAG) remains a challenge. Prior approaches attribute hallucinations to a binary conflict between internal knowledge stored in feed-forward networks (FFNs) and the retrieved context. However, this perspective is incomplete: it fails to account for other influences on the LLM's output, such as the user query, previously generated tokens, the self token, and the final LayerNorm adjustment. To capture these influences, we propose TPA, which mathematically attributes each token’s probability to seven distinct sources: Query, RAG Context, Past Token, Self Token, FFN, Final LayerNorm, and Initial Embedding. This attribution quantifies how each source contributes to the generation of the next token. We then aggregate these attribution scores by Part-of-Speech (POS) tags to quantify how much each source contributes to specific linguistic categories within a response. By leveraging these patterns, such as anomalies where Nouns rely heavily on LayerNorm, TPA identifies hallucinated responses. Extensive experiments show that TPA achieves state-of-the-art performance.
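
The POS-level aggregation described in the abstract can be illustrated with a minimal sketch. It assumes that per-token attribution scores over the seven sources are already available; how TPA computes them is not specified here. The function name `aggregate_by_pos`, the choice of a simple mean, and the toy numbers are all hypothetical and only illustrate the general idea, not the authors' implementation.

```python
from collections import defaultdict

# Source names are taken from the abstract; everything else is illustrative.
SOURCES = [
    "Query", "RAG Context", "Past Token", "Self Token",
    "FFN", "Final LayerNorm", "Initial Embedding",
]


def aggregate_by_pos(pos_tags, attributions):
    """Average per-token attribution vectors within each POS tag.

    pos_tags: list of POS tag strings, one per generated token.
    attributions: list of 7-element score lists, ordered as SOURCES.
    Returns {pos_tag: {source_name: mean_score}}.
    Assumption: a plain mean is used; the paper may aggregate differently.
    """
    if len(pos_tags) != len(attributions):
        raise ValueError("need exactly one attribution vector per token")

    sums = defaultdict(lambda: [0.0] * len(SOURCES))
    counts = defaultdict(int)
    for tag, scores in zip(pos_tags, attributions):
        if len(scores) != len(SOURCES):
            raise ValueError("each attribution vector must have 7 entries")
        for i, score in enumerate(scores):
            sums[tag][i] += score
        counts[tag] += 1

    return {
        tag: {src: sums[tag][i] / counts[tag] for i, src in enumerate(SOURCES)}
        for tag in sums
    }


# Made-up toy input: three tokens with hypothetical attribution scores.
toy_tags = ["NOUN", "VERB", "NOUN"]
toy_scores = [
    [0.0, 0.5, 0.0, 0.0, 0.25, 0.25, 0.0],
    [0.25, 0.25, 0.25, 0.0, 0.25, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.25, 0.75, 0.0],
]
features = aggregate_by_pos(toy_tags, toy_scores)
print(features["NOUN"]["Final LayerNorm"])
```

In a setup like this, the resulting per-tag averages could serve as features for a downstream detector, for example to flag responses in which Nouns lean unusually heavily on the Final LayerNorm source.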

Anthology ID: 2026.acl-long.1159
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 25273–25292
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1159/
Cite (ACL): Pengqian Lu, Jie Lu, Anjin Liu, and Guangquan Zhang. 2026. TPA: Next Token Probability Attribution for Detecting Hallucinations in RAG. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 25273–25292, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): TPA: Next Token Probability Attribution for Detecting Hallucinations in RAG (Lu et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1159.pdf
Checklist: 2026.acl-long.1159.checklist.pdf
