Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models

Youan Cong, Pritom Saha Akash, Cheng Wang, Kevin Chen-Chuan Chang


Abstract
We introduce the Extract-Refine-Retrieve-Read (ERRR) framework, a novel approach designed to bridge the pre-retrieval information gap in Retrieval-Augmented Generation (RAG) systems through query optimization tailored to meet the specific knowledge requirements of Large Language Models (LLMs). Unlike conventional query optimization techniques used in RAG, the ERRR framework begins by extracting parametric knowledge from LLMs, followed by using a specialized query optimizer for refining these queries. This process ensures the retrieval of only the most pertinent information essential for generating accurate responses. Moreover, to enhance flexibility and reduce computational costs, we propose a trainable scheme for our pipeline that utilizes a smaller, tunable model as the query optimizer, which is refined through knowledge distillation from a larger teacher model. Our evaluations on various question-answering (QA) datasets and with different retrieval systems show that ERRR consistently outperforms existing baselines, proving to be a versatile and cost-effective module for improving the utility and accuracy of RAG systems.
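The abstract's four-stage pipeline can be sketched in minimal form. Everything below is a stand-in: the function names, prompts, and component interfaces are illustrative assumptions, not the paper's actual implementation (in ERRR the extractor and reader are LLMs and the optimizer is a tunable distilled model).

```python
# Minimal ERRR (Extract-Refine-Retrieve-Read) sketch with hypothetical
# component interfaces; each callable stands in for an LLM or retriever.

def extract_parametric_knowledge(llm, question):
    # Step 1: elicit the LLM's own (possibly imperfect) parametric knowledge.
    return llm(f"Answer from memory: {question}")

def refine_queries(optimizer, question, parametric_knowledge):
    # Step 2: the query optimizer turns the question plus the model's draft
    # knowledge into targeted retrieval queries.
    return optimizer(question, parametric_knowledge)

def retrieve(retriever, queries):
    # Step 3: gather documents for every refined query.
    docs = []
    for q in queries:
        docs.extend(retriever(q))
    return docs

def read(llm, question, documents):
    # Step 4: generate the final answer conditioned on retrieved evidence.
    context = "\n".join(documents)
    return llm(f"Context:\n{context}\nQuestion: {question}")

def errr(llm, optimizer, retriever, question):
    knowledge = extract_parametric_knowledge(llm, question)
    queries = refine_queries(optimizer, question, knowledge)
    documents = retrieve(retriever, queries)
    return read(llm, question, documents)
```

With toy stubs for the three components, `errr(toy_llm, toy_optimizer, toy_retriever, "What is the capital of France?")` runs the full extract-refine-retrieve-read loop end to end; the point of the sketch is only the data flow between stages.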
Anthology ID:
2025.findings-emnlp.193
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3615–3625
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.193/
DOI:
10.18653/v1/2025.findings-emnlp.193
Cite (ACL):
Youan Cong, Pritom Saha Akash, Cheng Wang, and Kevin Chen-Chuan Chang. 2025. Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 3615–3625, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models (Cong et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.193.pdf
Checklist:
2025.findings-emnlp.193.checklist.pdf