@inproceedings{hao-etal-2025-beyond,
title = "Beyond Facts: Evaluating Intent Hallucination in Large Language Models",
author = "Hao, Yijie and
Yu, Haofei and
You, Jiaxuan",
editor = "Che, Wanxiang and
Nabende, Joyce and
Shutova, Ekaterina and
Pilehvar, Mohammad Taher",
booktitle = "Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2025",
address = "Vienna, Austria",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.349/",
pages = "7046--7069",
ISBN = "979-8-89176-251-0",
abstract = "When exposed to complex queries containing multiple conditions, today{'}s large language models (LLMs) tend to produce responses that only partially satisfy the query while neglecting certain conditions. We, therefore, introduce the concept of Intent Hallucination, a phenomenon where LLMs either omit (failing to address certain parts) or misinterpret (responding to invented query parts) elements of the given query, leading to responses misaligned with the original query. To systematically evaluate intent hallucination, we introduce FAITHQA, a novel benchmark for intent hallucination that contains 20,068 problems, covering both query-only and retrieval-augmented generation (RAG) setups with varying topics and difficulty. FAITHQA is the first hallucination benchmark that goes beyond factual verification, tailored to identify the fundamental cause of intent hallucination. By evaluating various LLMs on FAITHQA, we find that (1) intent hallucination is a common issue even for state-of-the-art models, and (2) such a phenomenon stems from omission or misinterpretation of LLMs. To facilitate future research, we introduce an automatic LLM generation evaluation metric, named INTENT CONSTRAINT, for detecting intent hallucination. Human evaluation results demonstrate that INTENT CONSTRAINT is closer to human performance for intent hallucination compared to baselines."
}
Markdown (Informal)
[Beyond Facts: Evaluating Intent Hallucination in Large Language Models](https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.349/) (Hao et al., ACL 2025)
ACL
Yijie Hao, Haofei Yu, and Jiaxuan You. 2025. Beyond Facts: Evaluating Intent Hallucination in Large Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7046–7069, Vienna, Austria. Association for Computational Linguistics.