CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering

Zongxi Li, Yang Li, Haoran Xie, S. Joe Qin


Abstract
Users often assume that large language models (LLMs) share their cognitive alignment of context and intent, leading them to omit critical information in question-answering (QA) and produce ambiguous queries. Responses based on misaligned assumptions may be perceived as hallucinations. Therefore, identifying possible implicit assumptions is crucial in QA. To address this fundamental challenge, we propose Conditional Ambiguous Question-Answering (CondAmbigQA), a benchmark comprising 2,000 ambiguous queries and condition-aware evaluation metrics. Our study pioneers “conditions” as explicit contextual constraints that resolve ambiguities in QA tasks through retrieval-based annotation, where retrieved Wikipedia fragments help identify possible interpretations for a given query and annotate answers accordingly. Experiments demonstrate that models considering conditions before answering improve answer accuracy by 11.75%, with an additional 7.15% gain when conditions are explicitly provided. These results highlight that apparent hallucinations may stem from inherent query ambiguity rather than model failure, and demonstrate the effectiveness of condition reasoning in QA, providing researchers with tools for rigorous evaluation.
Anthology ID:
2025.emnlp-main.115
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2269–2288
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.115/
Cite (ACL):
Zongxi Li, Yang Li, Haoran Xie, and S. Joe Qin. 2025. CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 2269–2288, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
CondAmbigQA: A Benchmark and Dataset for Conditional Ambiguous Question Answering (Li et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.115.pdf
Checklist:
2025.emnlp-main.115.checklist.pdf