SADA: Bridging In-Context Learning and Fine-Tuning via State-Aligned Distillation Adapters

Wenhao Gao; Tianlong Wang; Wei Jia; Linhao Zhang; Aiwei Liu; Miao Fan; Zhou Xiao

Abstract

Prompt-based in-context learning (ICL) and parameter fine-tuning are two dominant paradigms for incorporating external information into large language models (LLMs), but the former incurs high inference costs and the latter requires expensive retraining. To bridge this gap, context-to-parameter mapping converts prompts into temporary adapter weights. However, we identify a critical failure mode in existing methods: *hidden-state collapse*, where the adapter-augmented model’s internal states diverge sharply from the full-context oracle in deeper layers. We trace this failure to two coupled gaps: suboptimal **Input-Selection** and inadequate **Supervision-Signal**. To address these issues, we propose SADA (**S**tate-**A**ligned **D**istillation **A**dapters). We establish the *attention-block output* as a principled feature interface to improve input selection and introduce *state-alignment distillation* to enforce consistency between the adapter-augmented model and the full-context oracle. Experiments on long-context language modeling (PG19) and downstream NLU and summarization benchmarks show that SADA consistently outperforms strong baselines such as *StreamAdapter* and *GenerativeAdapter*, achieving performance comparable to ICL while significantly reducing memory footprint and latency. We further analyze when parameterized context compression is effective and when explicit context retention remains preferable. Our code is available at [https://github.com/Taylor-Gavel/SADA.git](https://github.com/Taylor-Gavel/SADA.git).

Anthology ID: 2026.acl-long.1046
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 22847–22862
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1046/
Cite (ACL): Wenhao Gao, Tianlong Wang, Wei Jia, Linhao Zhang, Aiwei Liu, Miao Fan, and Zhou Xiao. 2026. SADA: Bridging In-Context Learning and Fine-Tuning via State-Aligned Distillation Adapters. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 22847–22862, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): SADA: Bridging In-Context Learning and Fine-Tuning via State-Aligned Distillation Adapters (Gao et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1046.pdf
Checklist: 2026.acl-long.1046.checklist.pdf
