Rong Shan
2026
A Comprehensive Survey of Process Reward Models: Data Generation, Model Construction, and Usage
Congmin Zheng | Jiachen Zhu | Zhuoying Ou | Yuxiang Chen | Kangning Zhang | Rong Shan | Zeyu Zheng | Mengyue Yang | Jianghao Lin | Yong Yu | Weinan Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) have advanced reasoning ability, yet conventional alignment remains dominated by outcome reward models (ORMs) that judge only final answers. Process Reward Models (PRMs) address this gap by evaluating and guiding reasoning at the step or trajectory level. This survey provides a systematic overview of PRMs through the full loop: how to generate process data, build PRMs, and use PRMs for test-time scaling and reinforcement learning. We summarize applications across math, code, text, multimodal reasoning, robotics, and agents, and review emerging benchmarks. Our goal is to clarify design spaces, reveal open challenges, and guide future research toward fine-grained, robust reasoning alignment.
A Survey of Large Language Model-Based Search Agents
Yunjia Xi | Jianghao Lin | Yongzhao Xiao | Zheli Zhou | Rong Shan | Te Gao | Jiachen Zhu | Weiwen Liu | Yong Yu | Weinan Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The advent of Large Language Models (LLMs) has significantly revolutionized web search. The emergence of LLM-based Search Agents marks a pivotal shift towards deeper, dynamic, autonomous information seeking. These agents can comprehend user intentions and environment context and execute multi-turn retrieval with dynamic planning, extending search capabilities far beyond the web. Leading examples like OpenAI’s Deep Research highlight their potential for deep information mining and real-world applications. This survey provides the first systematic analysis of search agents. We comprehensively analyze and categorize existing works from the perspectives of architecture, optimization, application, and evaluation, ultimately identifying critical open challenges and outlining promising future research directions in this rapidly evolving field.