Context-Efficient Retrieval with Factual Decomposition

Yanhong Li, David Yunis, David McAllester, Jiawei Zhou


Abstract
There has recently been considerable interest in incorporating information retrieval into large language models (LLMs). Retrieval from a dynamically expanding external corpus of text allows a model to incorporate current events and can be viewed as a form of episodic memory. Here we demonstrate that pre-processing the external corpus into semi-structured “atomic facts” makes retrieval more efficient. More specifically, we show that our particular form of atomic facts improves performance on various question answering tasks when the amount of retrieved text is limited. Limiting the amount of retrieval reduces the size of the context and improves inference efficiency.
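
To make the idea concrete, below is a minimal sketch of fact-level retrieval under a token budget. Everything in it is illustrative rather than the authors' pipeline: the naive sentence splitter stands in for whatever method produces the paper's semi-structured atomic facts, the TF-IDF ranker stands in for the retriever, and the whitespace token count is a crude budget approximation.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def decompose(passage: str) -> list[str]:
    # Illustrative stand-in for factual decomposition: naive sentence
    # splitting instead of LLM-generated semi-structured atomic facts.
    return [s.strip() for s in passage.split(".") if s.strip()]

def retrieve(question: str, facts: list[str], token_budget: int = 64) -> list[str]:
    # Rank facts by TF-IDF cosine similarity to the question (any retriever
    # could be substituted), then greedily pack the top-ranked facts into
    # the context until the token budget is exhausted.
    vectorizer = TfidfVectorizer().fit(facts + [question])
    scores = cosine_similarity(
        vectorizer.transform([question]), vectorizer.transform(facts)
    )[0]
    context, used = [], 0
    for i in sorted(range(len(facts)), key=lambda i: -scores[i]):
        cost = len(facts[i].split())  # crude whitespace token count
        if used + cost > token_budget:
            break
        context.append(facts[i])
        used += cost
    return context

corpus = [
    "Marie Curie was born in Warsaw in 1867. She shared the 1903 Nobel "
    "Prize in Physics and won the 1911 Nobel Prize in Chemistry.",
]
facts = [fact for passage in corpus for fact in decompose(passage)]
print(retrieve("Where was Marie Curie born?", facts, token_budget=16))

Retrieving individual facts rather than whole passages is what lets the budget stay small: only the fact relevant to the question enters the context, while the rest of the passage is left out.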
Anthology ID: 2025.naacl-short.16
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 178–194
URL: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-short.16/
Cite (ACL): Yanhong Li, David Yunis, David McAllester, and Jiawei Zhou. 2025. Context-Efficient Retrieval with Factual Decomposition. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), pages 178–194, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): Context-Efficient Retrieval with Factual Decomposition (Li et al., NAACL 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.naacl-short.16.pdf