GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis

Yi Jiang, Sendong Zhao, Jianbo Li, Haochun Wang, Bing Qin


Abstract
The Retrieval-Augmented Generation (RAG) framework introduces a retrieval module to dynamicaslly inject retrieved information into the input context of large language models (LLMs), and has demonstrated significant success in various NLP tasks. However, the current study points out that there is a preference gap between retrievers and LLMs in the RAG framework, which limit the further improvement of system performance. Some highly relevant passages may interfere with LLM reasoning because they contain complex or contradictory information; while some indirectly related or even inaccurate content may help LLM generate more accurate answers by providing suggestive information or logical clues. To solve this, we propose **GainRAG**, a novel approach that aligns the retriever’s and LLM’s preferences by defining a new metric, “gain’’, which measure how well an input passage contributes to correct outputs.We then propose a method to estimate these gain signals and train a middleware that aligns the preferences of the retriever and the LLM using only limited data.In addition, we introduce a pseudo-passage strategy to mitigate degradation.The experimental results on 6 datasets verify the effectiveness of GainRAG.
Anthology ID:
2025.acl-long.527
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
10746–10757
Language:
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.527/
DOI:
Bibkey:
Cite (ACL):
Yi Jiang, Sendong Zhao, Jianbo Li, Haochun Wang, and Bing Qin. 2025. GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10746–10757, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis (Jiang et al., ACL 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.527.pdf