Max Schettewi


2025

pdf bib
Embedding-Free RAG
Jessica Maghakian | Raunak Sinha | Max Schettewi | Gunkirat Kaur
Findings of the Association for Computational Linguistics: EMNLP 2025

Retrieval-Augmented Generation (RAG) is the current state-of-the-art method for mitigating the shortcomings of large language models (LLMs) by incorporating external knowledge sources to provide more relevant and accurate responses to user queries. However building performant RAG systems for real use-cases typically requires heavy investment from NLP experts, such as fine-tuning embedding models for specialized domains, experimenting with text chunking strategies and other niche hyperparameter tunings. We propose Embedding-Free RAG, a model-agnostic approach that enables the deployment of a one-size-fits-all RAG pipeline for user-provided grounding documents. Unlike traditional RAG, which relies on embedding models for information retrieval, Embedding-Free RAG leverages the generalized reasoning abilities of LLMs in a novel algorithmic framework during the retrieval stage. Extensive experiments demonstrate that Embedding-Free RAG outperforms existing state-of-the-art methods, achieving up to 4.6x higher F1 scores and up to 2x better question answering accuracy across a wide range of challenging domains.