Junho Han

2026

ConvX: A Lightweight Converter to Bridge Indexed Dense Representations and Large Language Models for Retrieval-Augmented Generation
Bonggeun Choi | Keunha Kim | Junho Han | Youngjoong Ko
Findings of the Association for Computational Linguistics: ACL 2026

Retrieval-Augmented Generation (RAG) has significantly advanced open-domain question answering systems by incorporating external knowledge into large language models. Despite its effectiveness, existing RAG pipelines suffer from critical efficiency limitations. In particular, modern transformer-based generators exhibit quadratic or higher computational complexity with respect to input sequence length and hidden dimensionality, leading to substantial inference latency as model scale and context length increase. This issue is exacerbated in RAG settings, where retrieved contexts substantially expand the input prompt. To alleviate this challenge, we propose an effective compression-based RAG framework, ConvX, that directly leverages the indexed dense representations produced by a retriever, entirely replacing long text contexts. Our approach expands a single dense representation into a fixed number of memory slots using a lightweight converter to provide rich lexical information. This design enables efficient knowledge integration while significantly reducing input length and computational overhead. Empirical evaluations demonstrate that the proposed model achieves competitive performance compared to the existing state-of-the-art model, which uses a large ad-hoc context compressor, while offering substantially improved inference efficiency.


Venues

Findings (1)
