Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models

Alex Laitenberger, Christopher D. Manning, Nelson F. Liu


Abstract
With the rise of long-context language models (LMs) capable of processing tens of thousands of tokens in a single context window, do multi-stage retrieval-augmented generation (RAG) pipelines still offer measurable benefits over simpler, single-stage approaches? To assess this question, we conduct a controlled evaluation for QA tasks under systematically scaled token budgets, comparing two recent multi-stage pipelines, ReadAgent and RAPTOR, against three baselines, including DOS RAG (Document’s Original Structure RAG), a simple retrieve-then-read method that preserves original passage order. Despite its straightforward design, DOS RAG consistently matches or outperforms more intricate methods on multiple long-context QA benchmarks. We trace this strength to a combination of maintaining source fidelity and document structure, prioritizing recall within effective context windows, and favoring simplicity over added pipeline complexity. We recommend establishing DOS RAG as a simple yet strong baseline for future RAG evaluations, paired with state-of-the-art embedding and language models, and benchmarked under matched token budgets, to ensure that added pipeline complexity is justified by clear performance gains as models continue to improve.
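The abstract describes DOS RAG as a retrieve-then-read method that retrieves passages under a fixed token budget and then restores their original document order before reading. Below is a minimal sketch of that idea, not the authors' implementation: `embed` and `generate` are hypothetical stand-ins for any state-of-the-art embedding model and long-context chat LM, and the chunking scheme is assumed.

    # Minimal sketch of a DOS RAG-style retrieve-then-read pipeline, following
    # the description in the abstract. `embed` and `generate` are hypothetical
    # stand-ins for an embedding model and a long-context language model.
    from dataclasses import dataclass

    @dataclass
    class Chunk:
        position: int   # index of the chunk in the original document order
        text: str
        n_tokens: int

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def dos_rag(question, chunks, token_budget, embed, generate):
        # Score every chunk against the question with a dense embedding model.
        q_vec = embed(question)
        ranked = sorted(chunks, key=lambda c: dot(q_vec, embed(c.text)),
                        reverse=True)

        # Keep the highest-scoring chunks until the token budget is exhausted,
        # prioritizing recall within the model's effective context window.
        selected, used = [], 0
        for c in ranked:
            if used + c.n_tokens <= token_budget:
                selected.append(c)
                used += c.n_tokens

        # The DOS RAG step: restore the document's original structure by
        # re-sorting the retrieved chunks by source position before reading.
        selected.sort(key=lambda c: c.position)
        context = "\n\n".join(c.text for c in selected)
        return generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

The final re-sort by `position` is what distinguishes this from a plain relevance-ordered retrieve-then-read baseline; everything else is standard dense retrieval under a matched token budget.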
Anthology ID: 2025.emnlp-main.1656
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 32547–32557
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1656/
Cite (ACL):
Alex Laitenberger, Christopher D. Manning, and Nelson F. Liu. 2025. Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 32547–32557, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models (Laitenberger et al., EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1656.pdf
Checklist: 2025.emnlp-main.1656.checklist.pdf