Embedding-Informed Adaptive Retrieval-Augmented Generation of Large Language Models

Chengkai Huang; Yu Xia; Rui Wang; Kaige Xie; Tong Yu; Julian McAuley; Lina Yao

Abstract

Retrieval-augmented large language models (LLMs) have proven remarkably competent across various NLP tasks. However, prior work has observed that retrieval is not always helpful, especially when the LLM is already knowledgeable about the query. Motivated by this, Adaptive Retrieval-Augmented Generation (ARAG) retrieves only when the knowledge the query asks for is absent from the LLM. Previous ARAG approaches either require access to the pre-training corpus or rely on prompting with additional model inferences. To avoid these drawbacks, we propose to determine whether the model is knowledgeable about a query by inspecting the (contextualized) pre-trained token embeddings of LLMs. We hypothesize that such embeddings capture rich information about the model’s intrinsic knowledge base, which enables an efficient way of judging whether retrieval from an external corpus is necessary. Extensive experiments demonstrate our ARAG approach’s superior performance across various benchmarks.

Anthology ID: 2025.coling-main.94
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 1403–1412
URL: https://preview.aclanthology.org/fix-sig-urls/2025.coling-main.94/
Cite (ACL): Chengkai Huang, Yu Xia, Rui Wang, Kaige Xie, Tong Yu, Julian McAuley, and Lina Yao. 2025. Embedding-Informed Adaptive Retrieval-Augmented Generation of Large Language Models. In Proceedings of the 31st International Conference on Computational Linguistics, pages 1403–1412, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): Embedding-Informed Adaptive Retrieval-Augmented Generation of Large Language Models (Huang et al., COLING 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.coling-main.94.pdf
