Unanswerability Evaluation for Retrieval Augmented Generation

Xiangyu Peng; Prafulla Kumar Choubey; Caiming Xiong; Chien-Sheng Wu

Abstract

Existing evaluation frameworks for retrieval-augmented generation (RAG) systems focus on answerable queries, but they overlook the importance of appropriately rejecting unanswerable requests. In this paper, we introduce UAEval4RAG, a comprehensive framework for evaluating whether RAG systems effectively handle queries that are unanswerable with respect to a given knowledge base. We first define a taxonomy of six unanswerable categories. UAEval4RAG automatically synthesizes diverse and challenging queries for any given knowledge base and evaluates RAG systems with unanswered ratio and acceptable ratio metrics. We also conduct experiments with various RAG components and prompting strategies across four datasets, which reveal that, because knowledge distribution varies across datasets, no single configuration consistently delivers optimal performance on both answerable and unanswerable requests across different knowledge bases. Our findings highlight the critical role of component selection and prompt design in optimizing RAG systems to balance accuracy on answerable queries with high rejection rates for unanswerable ones. UAEval4RAG provides valuable insights and tools for developing more robust and reliable RAG systems.

Anthology ID: 2025.acl-long.415
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 8452–8472
URL: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.415/
Cite (ACL): Xiangyu Peng, Prafulla Kumar Choubey, Caiming Xiong, and Chien-Sheng Wu. 2025. Unanswerability Evaluation for Retrieval Augmented Generation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8452–8472, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Unanswerability Evaluation for Retrieval Augmented Generation (Peng et al., ACL 2025)
PDF: https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.415.pdf
