Test-Time Strategies for More Efficient and Accurate Agentic RAG

Abhinav Sharma; Brian Zhang; Deepti Guntur; Zhiyang Zuo; Shreyas Chaudhari; Wenlong Zhao; Franck Dernoncourt; Puneet Mathur; Ryan A. Rossi; Nedim Lipka

Abstract

Retrieval-Augmented Generation (RAG) systems face challenges with complex, multi-hop questions, and iterative agentic frameworks such as Search-R1 (Jin et al., 2025) have been proposed to address these complexities. However, such approaches can introduce inefficiencies, including repetitive retrieval of previously processed information and challenges in contextualizing retrieved results effectively within the current generation prompt. Such issues can lead to unnecessary retrieval turns, suboptimal reasoning, inaccurate answers, and increased token consumption. In this paper, we investigate test-time modifications to Search-R1’s open-source Qwen2.5-7B pipeline to mitigate these identified shortcomings. Specifically, we explore the integration of two components and their combination: a contextualization module to better integrate relevant information from retrieved documents into reasoning, and a de-duplication module that replaces previously retrieved documents with the next most relevant ones. We evaluate our approaches using the HotpotQA (Yang et al., 2018) and the Natural Questions (Kwiatkowski et al., 2019) datasets, reporting the exact match (EM) score, an LLM-as-a-Judge assessment of answer correctness, and the average number of turns. Our best-performing variant (contextualization) achieves a 5.6% increase in EM score and reduces the average number of turns by 10.5% compared to the Search-R1 baseline. While contextualization itself introduces additional LLM calls, our results demonstrate improved answer accuracy and reduced retrieval load.
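
The abstract describes the de-duplication module only at a high level: documents already retrieved in earlier turns are replaced with the next most relevant ones. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the retriever returns a ranked candidate list longer than `k`; the function name `dedup_retrieve`, the `(doc_id, text)` document representation, and the `seen_ids` set are hypothetical names introduced for illustration.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical document representation: (doc_id, text).
Doc = Tuple[str, str]


def dedup_retrieve(ranked: Sequence[Doc], seen_ids: Set[str], k: int = 3) -> List[Doc]:
    """Return up to k documents from `ranked` that were not returned in earlier turns.

    `ranked` is assumed to be sorted by decreasing relevance, so skipping a
    previously seen document lets the next most relevant one take its place.
    `seen_ids` is updated in place so later turns also skip these documents.
    """
    fresh: List[Doc] = []
    for doc_id, text in ranked:
        if doc_id in seen_ids:
            continue  # already shown to the model in an earlier turn
        fresh.append((doc_id, text))
        seen_ids.add(doc_id)
        if len(fresh) == k:
            break
    return fresh


# Two retrieval turns sharing one `seen` set (toy data, for illustration only).
seen: Set[str] = set()
turn1 = dedup_retrieve([("d1", "a"), ("d2", "b"), ("d3", "c"), ("d4", "d")], seen, k=2)
turn2 = dedup_retrieve([("d2", "b"), ("d1", "a"), ("d5", "e"), ("d4", "d")], seen, k=2)
```

In this toy run the second turn skips `d1` and `d2`, which the first turn already returned, and fills its two slots with `d5` and `d4` instead.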

Anthology ID: 2026.acl-srw.41
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Santosh T.Y.S.S., Juan Diego Rodriguez, Ona de Gibert
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 463–469
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-srw.41/
Cite (ACL): Abhinav Sharma, Brian Zhang, Deepti Guntur, Zhiyang Zuo, Shreyas Chaudhari, Wenlong Zhao, Franck Dernoncourt, Puneet Mathur, Ryan A. Rossi, and Nedim Lipka. 2026. Test-Time Strategies for More Efficient and Accurate Agentic RAG. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026), pages 463–469, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Test-Time Strategies for More Efficient and Accurate Agentic RAG (Sharma et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-srw.41.pdf
