Separate the Wheat from the Chaff: Winnowing Down Divergent Views in Retrieval Augmented Generation

Song Wang, Zihan Chen, Peng Wang, Zhepei Wei, Zhen Tan, Yu Meng, Cong Shen, Jundong Li


Abstract
Retrieval-augmented generation (RAG) addresses the limitation of large language models (LLMs) in achieving up-to-date information by integrating external knowledge sources, but it is hindered by noisy or irrelevant retrieved data, leading to reduced accuracy. Additionally, most RAG methods rely on task-specific supervision, reducing their adaptability across domains. To overcome these challenges, we propose WinnowRAG, a novel multi-agent debate-based RAG framework. WinnowRAG operates in two stages: in Stage I, query-aware clustering groups similar documents, with each cluster assigned to an LLM agent for generating personalized responses. A critic LLM then consolidates these answers, forming super-agents. In Stage II, the super-agents engage in a structured discussion to filter out incorrect or irrelevant information, ensuring only relevant knowledge is used for final response generation. Crucially, WinnowRAG is unsupervised and leverages pretrained LLMs without requiring fine-tuning, making it easily adaptable to various tasks. The experiments on various realistic datasets demonstrate the effectiveness of WinnowRAG over state-of-the-art baselines.
Anthology ID:
2025.emnlp-main.587
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
11626–11642
Language:
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.587/
DOI:
Bibkey:
Cite (ACL):
Song Wang, Zihan Chen, Peng Wang, Zhepei Wei, Zhen Tan, Yu Meng, Cong Shen, and Jundong Li. 2025. Separate the Wheat from the Chaff: Winnowing Down Divergent Views in Retrieval Augmented Generation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 11626–11642, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Separate the Wheat from the Chaff: Winnowing Down Divergent Views in Retrieval Augmented Generation (Wang et al., EMNLP 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.587.pdf
Checklist:
 2025.emnlp-main.587.checklist.pdf