Jianxin Zhang
2026
Beyond Task-Level Context: Class-Conditional Context Vectors for Implicit In-Context Learning
Jianxin Zhang | Yilu Hu | Teng Liu | Pei Guo | Juntao Li
Findings of the Association for Computational Linguistics: ACL 2026
Implicit In-Context Learning compresses demonstration examples into a single context vector and injects it into the model’s activation space, achieving few-shot-level performance at near zero-shot inference cost. However, existing approaches typically aggregate demonstrations from all classes into a shared, task-level context vector, which captures global task information but does not explicitly preserve fine-grained, class-conditional semantic distinctions. In this work, we propose Class-Conditional Context Vectors (C3V), a simple yet effective extension to implicit in-context learning that explicitly models class-specific contextual information by constructing a separate context vector for each class, without relying on explicit prompts. These class-conditional context vectors are additively injected into the model’s residual streams in a single forward pass, enabling contextual contributions to be modulated in a class-aware manner while keeping the backbone frozen. We evaluate C3V on multiple text classification benchmarks across several families of large language models. Experimental results demonstrate that C3V consistently outperforms task-level context vector baselines and achieves higher average accuracy than conventional few-shot learning on most models.
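As a rough illustration of the idea of per-class context vectors added to a residual stream, the toy sketch below uses plain NumPy arrays in place of real model activations. It is not the C3V implementation: the abstract does not say how the vectors are constructed or how their contributions are modulated, so the mean-pooling step, the `weights` argument, the `alpha` scale, and both function names are assumptions made only for this example.

```python
import numpy as np

def class_context_vectors(demo_acts, demo_labels, num_classes):
    """Build one vector per class by averaging that class's demonstration
    activations (an assumed construction, not taken from the paper)."""
    dim = demo_acts.shape[1]
    vectors = np.zeros((num_classes, dim))
    for c in range(num_classes):
        vectors[c] = demo_acts[demo_labels == c].mean(axis=0)
    return vectors

def inject(residual, vectors, weights, alpha=0.1):
    """Additively inject a weighted sum of class vectors into a single
    residual-stream state. `weights` and `alpha` are hypothetical knobs."""
    return residual + alpha * (weights @ vectors)

# Toy data: three demonstrations with 2-dimensional "activations".
demo_acts = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]])
demo_labels = np.array([0, 0, 1])
vectors = class_context_vectors(demo_acts, demo_labels, num_classes=2)
shifted = inject(np.zeros(2), vectors, weights=np.array([1.0, 0.0]), alpha=0.5)
```

In a real model the residual state would come from a frozen backbone's forward pass, and the injection would be applied at one or more layers.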
2025
ALW: Adaptive Layer-Wise contrastive decoding enhancing reasoning ability in Large Language Models
Yuechi Zhou | Chuyue Zhou | Jianxin Zhang | Juntao Li | Min Zhang
Findings of the Association for Computational Linguistics: ACL 2025
Large language models (LLMs) have achieved remarkable performance across various reasoning tasks. However, many LLMs still encounter challenges in reasoning, especially those with fewer parameters or insufficient pre-training data. Through our experiments, we identify that noise accumulation across layers often leads to unstable token predictions during reasoning. We find that contrasting the probability distributions across layers effectively mitigates this interference. Building on this insight, we propose Adaptive Layer-Wise contrastive decoding (ALW), a novel framework that enhances reasoning ability by dynamically disentangling noise in shallow layers from critical signals in deep layers. Extensive experiments on several reasoning benchmarks demonstrate that ALW consistently improves answer accuracy across multiple LLMs while maintaining inference efficiency. For example, we achieve a 48% improvement on GSM8K with the LLaMA-7B model and an absolute accuracy increase of 5.2 points on the BBH evaluation benchmark with the LLaMA-65B model.
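The general idea of contrasting next-token distributions from two layers can be sketched as follows. This is a generic, fixed-weight layer contrast on toy logits, not the ALW algorithm: the abstract does not describe how ALW adapts the contrast, so the `beta` weight and the function names here are assumptions for illustration only.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over a 1-D logit vector."""
    shifted = logits - logits.max()
    return shifted - np.log(np.exp(shifted).sum())

def contrast_layers(deep_logits, shallow_logits, beta=1.0):
    """Score each token by its deep-layer log-probability minus beta times
    its shallow-layer log-probability (beta is a hypothetical fixed weight)."""
    return log_softmax(deep_logits) - beta * log_softmax(shallow_logits)

# Toy logits over a 3-token vocabulary.
deep = np.array([2.0, 1.0, 0.0])
shallow = np.array([2.0, 0.0, 0.0])
scores = contrast_layers(deep, shallow)
chosen = int(np.argmax(scores))
```

In this toy case the deep layer alone prefers token 0, but token 0 is already strongly preferred by the shallow layer, so the contrast favors the token whose probability grew most between the two layers. Setting `beta=0.0` recovers ordinary decoding from the deep layer.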