Junzhe Zhou
2026
AED-RAG: Continuous Multi-Granular Context Fusion for Retrieval-Augmented Generation via Adaptive Ensemble Decoding
Junzhe Zhou | Fulin Lin | Tairan Cheng | Shaowen Chen | Hongwei Wang
Findings of the Association for Computational Linguistics: ACL 2026
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) yet suffers from a mismatch between coarse retrieval granularity and fine-grained generation needs. Specifically, coarse-grained passages inherently conflate valid context with intra-passage noise that semantic retrieval often fails to filter. Existing alignment strategies, which typically rely on discrete reranking, struggle to address this granularity mismatch or to balance external evidence effectively against internal knowledge. To bridge this gap, we propose **AED-RAG**, a framework that combines discrete retrieval with continuous **A**daptive **E**nsemble **D**ecoding. We fine-tune a utility predictor using contrastive perplexity to discern the differences in information density between unstructured narrative passages and structured knowledge triplets. During inference, this predictor projects passages, triplets, and the model’s parametric memory into a unified probability space, enabling a soft, token-level fusion that dynamically optimizes information gain. Extensive experiments on four open-domain QA benchmarks demonstrate that AED-RAG significantly outperforms competitive baselines, underscoring the effectiveness of integrating multi-granular contexts.
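To make the idea of soft, token-level fusion concrete, the following is a minimal sketch of one way such a fusion could work, not the authors' implementation. It assumes that each context source (a passage, a triplet set, and the model's parametric memory with no context) yields its own next-token distribution, and that a utility predictor assigns each source a scalar score. The function names `softmax` and `fuse_next_token`, the softmax weighting, and all numbers are hypothetical and chosen only for illustration; the abstract does not specify the actual fusion rule.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_next_token(distributions, utilities):
    """Mix per-source next-token distributions into a single distribution.

    distributions: list of dicts mapping token -> probability, one per source.
    utilities: list of scalar scores, one per source (assumed to come from a
               utility predictor; here they are hard-coded placeholders).
    Returns the fused distribution and the mixture weights.
    """
    weights = softmax(utilities)
    vocab = set().union(*[d.keys() for d in distributions])
    fused = {
        tok: sum(w * d.get(tok, 0.0) for w, d in zip(weights, distributions))
        for tok in vocab
    }
    return fused, weights


# Toy next-token distributions for one decoding step (illustrative values only).
p_passage = {"Paris": 0.6, "Lyon": 0.3, "Rome": 0.1}    # narrative passage context
p_triplet = {"Paris": 0.9, "Lyon": 0.05, "Rome": 0.05}  # knowledge-triplet context
p_param = {"Paris": 0.4, "Lyon": 0.2, "Rome": 0.4}      # no context (parametric memory)

# Hypothetical utility scores: the triplet context is judged most useful here.
utilities = [1.0, 2.0, 0.0]

fused, weights = fuse_next_token([p_passage, p_triplet, p_param], utilities)
next_token = max(fused, key=fused.get)
```

Because the weights form a convex combination, the fused result is itself a valid probability distribution, and a source with low utility is down-weighted rather than discarded, which is the contrast with discrete reranking. In a token-level scheme the utility scores would be recomputed at every decoding step.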