SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA

Venktesh V, Mandeep Rathee, Avishek Anand


Abstract
Complex question-answering (QA) systems face significant challenges in retrieving and reasoning over information that addresses multifaceted queries. While large language models (LLMs) have advanced the reasoning capabilities of these systems, the bounded-recall problem persists, where procuring all relevant documents in first-stage retrieval remains a challenge. Missing pertinent documents at this stage leads to performance degradation that cannot be remedied in later stages, especially given the limited context windows of LLMs, which necessitate high recall at smaller retrieval depths. In this paper, we introduce SUNAR, a novel approach that leverages LLMs to guide a Neighborhood Aware Retrieval process. SUNAR iteratively explores a neighborhood graph of documents, dynamically promoting or penalizing documents based on uncertainty estimates from interim LLM-generated answer candidates. We validate our approach through extensive experiments on two complex QA datasets. Our results show that SUNAR significantly outperforms existing retrieve-and-reason baselines, achieving up to a 31.84% improvement in performance over existing state-of-the-art methods for complex QA. Our code and data are anonymously available at https://anonymous.4open.science/r/SUNAR-8D36/.
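The loop described in the abstract (explore a document neighborhood graph, then promote or penalize documents according to the uncertainty of an interim answer) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the `relevance` and `uncertainty` callables, the additive adjustment `weight * (0.5 - u)`, and the `budget`, `batch_size`, and `weight` parameters are all assumptions made for illustration, and the abstract does not specify how SUNAR computes or applies its uncertainty estimates.

```python
from typing import Callable, Dict, List


def sketch_neighborhood_retrieval(
    first_stage: Dict[str, float],              # doc id -> first-stage retrieval score
    neighbors: Dict[str, List[str]],            # doc id -> neighboring doc ids in the graph
    relevance: Callable[[str], float],          # hypothetical re-scorer for one document
    uncertainty: Callable[[List[str]], float],  # hypothetical: uncertainty in [0, 1] of the
                                                # interim answer generated from a batch
    budget: int = 4,                            # assumed cap on documents scored
    batch_size: int = 2,
    weight: float = 0.5,                        # assumed strength of the adjustment
) -> List[str]:
    """Return scored doc ids, best first (illustrative only)."""
    frontier = dict(first_stage)    # unscored candidates with their priorities
    scored: Dict[str, float] = {}
    while frontier and len(scored) < budget:
        take = min(batch_size, budget - len(scored))
        # Pick the highest-priority unscored candidates (ties broken by id).
        batch = sorted(frontier, key=lambda d: (-frontier[d], d))[:take]
        u = uncertainty(batch)
        for doc in batch:
            del frontier[doc]
            # Assumed rule: promote when the interim answer is confident
            # (u < 0.5), penalize when it is uncertain (u > 0.5).
            scored[doc] = relevance(doc) + weight * (0.5 - u)
        # Expand the neighborhood: unscored neighbors inherit the adjusted
        # score of the best document that points to them.
        for doc in batch:
            for nb in neighbors.get(doc, []):
                if nb not in scored:
                    frontier[nb] = max(frontier.get(nb, float("-inf")), scored[doc])
    return sorted(scored, key=lambda d: (-scored[d], d))
```

In this sketch, a document that the first stage missed can still enter the ranking if it neighbors a document whose interim answer was confident, which is the intuition behind addressing bounded recall at small retrieval depths.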
Anthology ID:
2025.naacl-long.300
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
5818–5835
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.300/
Cite (ACL):
Venktesh V, Mandeep Rathee, and Avishek Anand. 2025. SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 5818–5835, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA (V et al., NAACL 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.naacl-long.300.pdf