Alan Schelten


2025

HalluLens: LLM Hallucination Benchmark
Yejin Bang | Ziwei Ji | Alan Schelten | Anthony Hartshorn | Tara Fowler | Cheng Zhang | Nicola Cancedda | Pascale Fung
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Large language models (LLMs) often generate responses that deviate from user input or training data, a phenomenon known as “hallucination.” These hallucinations undermine user trust and hinder the adoption of generative AI systems. Addressing hallucinations is important for the advancement of LLMs. This paper introduces HalluLens, a comprehensive hallucination benchmark that incorporates both extrinsic and intrinsic evaluation tasks and is built upon a clear taxonomy of hallucination. A major challenge in benchmarking hallucinations is the lack of a unified framework due to inconsistent definitions and categorizations. We disentangle LLM hallucination from “factuality” and propose a taxonomy distinguishing extrinsic and intrinsic hallucinations to promote consistency and facilitate research. We emphasize extrinsic hallucinations – where generated content deviates from training data – as they become increasingly relevant with LLM advancements. However, no benchmark is solely dedicated to extrinsic hallucinations. To address this gap, HalluLens introduces three new extrinsic tasks with dynamic test set generation to mitigate data leakage and ensure robustness. We release the codebase for the extrinsic hallucination benchmark.

Calibrating Verbal Uncertainty as a Linear Feature to Reduce Hallucinations
Ziwei Ji | Lei Yu | Yeskendir Koishekenov | Yejin Bang | Anthony Hartshorn | Alan Schelten | Cheng Zhang | Pascale Fung | Nicola Cancedda
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

LLMs often adopt an assertive language style even when making false claims. Such “overconfident hallucinations” mislead users and erode trust. Achieving the ability to express in language the actual degree of uncertainty around a claim is therefore of great importance. We find that “verbal uncertainty” is governed by a single linear feature in the representation space of LLMs, and show that this has only moderate correlation with the actual “semantic uncertainty” of the model. We apply this insight and show that (1) the mismatch between semantic and verbal uncertainty is a better predictor of hallucinations than semantic uncertainty alone and (2) we can intervene on verbal uncertainty at inference time and reduce confident hallucinations on short-form answers, achieving an average relative reduction of ~30%.