CriticSearch: Fine-Grained Credit Assignment for Search Agents via a Retrospective Critic

Yaocheng Zhang, Haohuan Huang, Zijun Song, Zijie Zhao, Qichao Zhang, Yuanheng Zhu, Dongbin Zhao


Abstract
Tool-Integrated Reasoning (TIR) with search engines enables large language models to iteratively retrieve up-to-date external knowledge, enhancing adaptability and generalization in complex question-answering tasks. However, existing search agent pipelines typically depend on reinforcement-learning-based optimization, which often suffers from sparse outcome rewards, leading to inefficient exploration and unstable training. We introduce CriticSearch, a fine-grained credit-assignment framework that supplies dense, turn-level feedback via a retrospective critic mechanism. During training, a frozen, asymmetric critique LLM retrospectively evaluates each turn using privileged information from the full trajectory and gold answers, and these assessments are converted into stable, dense rewards that guide policy improvement. Experimental results across diverse multi-hop reasoning benchmarks demonstrate that CriticSearch consistently outperforms existing baselines, achieving faster convergence, improved training stability, and higher performance.
Anthology ID:
2026.findings-acl.596
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12272–12290
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.596/
Cite (ACL):
Yaocheng Zhang, Haohuan Huang, Zijun Song, Zijie Zhao, Qichao Zhang, Yuanheng Zhu, and Dongbin Zhao. 2026. CriticSearch: Fine-Grained Credit Assignment for Search Agents via a Retrospective Critic. In Findings of the Association for Computational Linguistics: ACL 2026, pages 12272–12290, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
CriticSearch: Fine-Grained Credit Assignment for Search Agents via a Retrospective Critic (Zhang et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.596.pdf
Checklist:
 2026.findings-acl.596.checklist.pdf