Mandeep Rathee

2026

Test-time Corpus Feedback: From Retrieval to RAG
Mandeep Rathee | Venktesh V | Sean MacAvaney | Avishek Anand
Findings of the Association for Computational Linguistics: EACL 2026

Retrieval-Augmented Generation (RAG) has emerged as a standard framework for knowledge-intensive NLP tasks, combining large language models (LLMs) with document retrieval from external corpora. Despite its widespread use, most RAG pipelines continue to treat retrieval and reasoning as isolated components, retrieving documents once and then generating answers without further interaction. This static design often limits performance on complex tasks that require iterative evidence gathering or high-precision retrieval. Recent work in both the information retrieval (IR) and NLP communities has begun to close this gap by introducing adaptive retrieval and ranking methods that incorporate feedback. In this survey, we present a structured overview of advanced retrieval and ranking mechanisms that integrate such feedback. We categorize feedback signals based on their source and role in improving the query, retrieved context, or document pool. By consolidating these developments, we aim to bridge IR and NLP perspectives and highlight retrieval as a dynamic, learnable component of end-to-end RAG systems.

2025

pdf bib abs

SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA
Venktesh V | Mandeep Rathee | Avishek Anand
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Complex question-answering (QA) systems face significant challenges in retrieving and reasoning over information that addresses multifaceted queries. While large language models (LLMs) have advanced the reasoning capabilities of these systems, the bounded-recall problem persists, where procuring all relevant documents in first-stage retrieval remains a challenge. Missing pertinent documents at this stage leads to performance degradation that cannot be remedied in later stages, especially given the limited context windows of LLMs which necessitate high recall at smaller retrieval depths. In this paper, we introduce SUNAR, a novel approach that leverages LLMs to guide a Neighborhood Aware Retrieval process. SUNAR iteratively explores a neighborhood graph of documents, dynamically promoting or penalizing documents based on uncertainty estimates from interim LLM-generated answer candidates. We validate our approach through extensive experiments on two complex QA datasets. Our results show that SUNAR significantly outperforms existing retrieve-and-reason baselines, achieving up to a 31.84% improvement in performance over existing state-of-the-art methods for complex QA. Our code and data are anonymously available at https://anonymous.4open.science/r/SUNAR-8D36/.

Co-authors

Venues

Findings1
NAACL1

Fix author