Smart-Searcher: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning
Huatong Song | Jinhao Jiang | Wenqing Tian | Zhipeng Chen | Yuhuan Wu | Jiahao Zhao | Yingqian Min | Xin Zhao | Lei Fang | Ji-Rong Wen
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Language Models (LLMs) are powerful but prone to hallucinations due to their static knowledge. Retrieval-Augmented Generation (RAG) helps by injecting external information, but current methods are often costly, generalize poorly, or ignore the model's internal knowledge. In this paper, we introduce Smart-Searcher, a novel framework designed to train LLMs to adaptively leverage both internal and external knowledge sources. Smart-Searcher employs a two-stage training strategy: an initial SFT cold-start phase for preliminary format learning, followed by RL for dynamic knowledge acquisition. The RL stage uses outcome supervision to encourage exploration, incorporates a reward mechanism for internal knowledge utilization, and integrates a memorization mechanism to continuously assimilate retrieved information, thereby enriching the model's internal knowledge. By leveraging internal knowledge and an external search engine, the model continuously improves its capabilities, enabling efficient retrieval-augmented reasoning. Our experiments demonstrate that Smart-Searcher outperforms previous RAG and reasoning methods and achieves efficient retrieval. The code is available at https://github.com/RUCAIBox/R1-Searcher-plus.
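The outcome-supervised reward with an internal-knowledge bonus described above could be sketched as follows. This is a minimal illustrative sketch, not the paper's actual implementation: the function name, bonus weight, and rollout representation are all assumptions made for clarity.

```python
# Illustrative sketch (assumed design, not the paper's code): score each
# rollout by final-answer correctness, with a bonus when the model answers
# correctly without calling the external search engine, rewarding reliance
# on internal knowledge.

def outcome_reward(answer_correct: bool, num_search_calls: int,
                   internal_bonus: float = 0.5) -> float:
    """Return a scalar reward for one rollout (hypothetical weights)."""
    reward = 1.0 if answer_correct else 0.0
    # Bonus for solving the question from internal knowledge alone.
    if answer_correct and num_search_calls == 0:
        reward += internal_bonus
    return reward

# Example rollouts: (answer_correct, number of search-engine calls)
rollouts = [(True, 0), (True, 2), (False, 1)]
rewards = [outcome_reward(c, n) for c, n in rollouts]
print(rewards)  # [1.5, 1.0, 0.0]
```

Under this kind of shaping, a policy that can answer from memory earns strictly more than one that searches unnecessarily, which matches the abstract's goal of incentivizing internal knowledge utilization.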