Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision
Xiaopeng Ye, Chen Xu, Chaoliang Zhang, Zhaocheng Du, Jun Xu, Gang Wang, Zhenhua Dong
Abstract
Query rewriting plays a pivotal role in Retrieval-Augmented Generation (RAG) by refining real-world queries of varying complexity. Existing approaches typically rely on outcome-supervised training or heuristic rules to guide the rewriting process. However, these paradigms often struggle to handle queries with varying levels of complexity, posing over- and under-refinement problems. We identify the root cause of these issues as the absence of supervision signals for intermediate steps. To fully construct and utilize such signals, we propose Q-PRM, a novel query rewriting framework. Q-PRM reformulates the rewriting process as a Markov Decision Process (MDP) composed of atomic rewriting steps. In this way, Q-PRM can apply process-level supervision to each atomic step according to the query type, offering more targeted and effective guidance. Q-PRM comprises three key stages: (1) applying Monte Carlo Tree Search to generate step-level process supervision signals; (2) performing reinforced self-training for progressive process refinement; and (3) employing PRM-guided decoding during inference. Experiments on several open-domain QA benchmarks demonstrate that Q-PRM consistently outperforms baselines across different levels of query complexity.
- Anthology ID: 2025.findings-emnlp.817
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 15113–15128
- URL: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.817/
- DOI: 10.18653/v1/2025.findings-emnlp.817
- Cite (ACL): Xiaopeng Ye, Chen Xu, Chaoliang Zhang, Zhaocheng Du, Jun Xu, Gang Wang, and Zhenhua Dong. 2025. Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 15113–15128, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision (Ye et al., Findings 2025)
- PDF: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.817.pdf