Wei Jia
Papers on this page may belong to the following people: Wei Jia, Wei Jia
2026
SADA: Bridging In-Context Learning and Fine-Tuning via State-Aligned Distillation Adapters
Wenhao Gao | Tianlong Wang | Wei Jia | Linhao Zhang | Aiwei Liu | Miao Fan | Zhou Xiao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Prompt-based in-context learning (ICL) and parameter fine-tuning are two dominant paradigms for incorporating external information into large language models (LLMs), but they incur high inference costs or require expensive retraining. To bridge this gap, context-to-parameter mapping converts prompts into temporary adapter weights. However, we identify a critical failure mode in existing methods: *hidden-state collapse*, where the adapter-augmented model’s internal states diverge sharply from the full-context oracle in deeper layers. We trace this failure to two coupled gaps: suboptimal **Input-Selection** and inadequate **Supervision-Signal**. To address these issues, we propose SADA (**S**tate-**A**ligned **D**istillation **A**dapters). We establish the *attention-block output* as a principled feature interface to improve input selection and introduce *state-alignment distillation* to enforce consistency between the adapter-augmented model and the full-context oracle. Experiments on long-context language modeling (PG19) and downstream NLU and summarization benchmarks show that SADA consistently outperforms strong baselines like *StreamAdapter* and *GenerativeAdapter*, achieving performance comparable to ICL while significantly reducing memory footprint and latency. We further analyze when parameterized context compression is effective and when explicit context retention remains preferable. Our code is available at [https://github.com/Taylor-Gavel/SADA.git](https://github.com/Taylor-Gavel/SADA.git).
Beyond Transcription: Unified Audio Schema for Perception-Aware AudioLLMs
Linhao Zhang | Yuhan Song | Aiwei Liu | Chuhan Wu | Sijun Zhang | Wei Jia | Yuan Liu | Houfeng Wang | Zhou Xiao
Findings of the Association for Computational Linguistics: ACL 2026
Recent Audio Large Language Models (AudioLLMs) exhibit a striking performance inversion: while excelling at complex reasoning tasks, they consistently underperform on fine-grained acoustic perception. We attribute this gap to a fundamental limitation of ASR-centric training, which provides precise linguistic targets but implicitly teaches models to suppress paralinguistic cues and acoustic events as noise. To address this, we propose Unified Audio Schema (UAS), a holistic and structured supervision framework that organizes audio information into three explicit components—Transcription, Paralinguistics, and Non-linguistic Events—within a unified JSON format. This design achieves comprehensive acoustic coverage without sacrificing the tight audio-text alignment that enables reasoning. We validate the effectiveness of this supervision strategy by applying it to both discrete and continuous AudioLLM architectures. Extensive experiments on MMSU, MMAR, and MMAU demonstrate that UAS-Audio yields consistent improvements, boosting fine-grained perception by 10.9% on MMSU over the same-size state-of-the-art models while preserving robust reasoning capabilities. Our code and model are publicly available at https://github.com/Tencent/Unified_Audio_Schema.
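As a rough illustration of the three-component JSON supervision target described above, the sketch below builds one record in Python. The three top-level keys come from the abstract; the nested field names (`emotion`, `speaking_rate`), the helper `make_uas_record`, and the sample values are hypothetical and are not taken from the released schema.

```python
import json


def make_uas_record(transcription, paralinguistics, events):
    """Bundle the three components into one JSON-serializable dict.

    The nested structure of each component is an assumption made for
    illustration; the actual schema may differ.
    """
    return {
        "Transcription": transcription,
        "Paralinguistics": paralinguistics,
        "Non-linguistic Events": events,
    }


# Hypothetical example values, not drawn from any dataset.
record = make_uas_record(
    transcription="Could you close the window?",
    paralinguistics={"emotion": "annoyed", "speaking_rate": "fast"},
    events=["traffic noise"],
)

# Serialize to a single string that could serve as a text target.
target = json.dumps(record, ensure_ascii=False)
print(target)
```

Keeping the transcription as one field among three is what lets a single text target carry both the linguistic content and the acoustic cues.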
2025
LegalAgentBench: Evaluating LLM Agents in Legal Domain
Haitao Li | Junjie Chen | Jingli Yang | Qingyao Ai | Wei Jia | Youfeng Liu | Kai Lin | Yueyue Wu | Guozhi Yuan | Yiran Hu | Wuyue Wang | Yiqun Liu | Minlie Huang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
With the increasing intelligence and autonomy of LLM Agents, their potential applications in the legal domain are becoming increasingly apparent. However, existing general-domain benchmarks are unable to fully capture the complexity and subtle nuances inherent in real-world judicial cognition and decision-making. Therefore, we propose LegalAgentBench, a comprehensive benchmark specifically designed to evaluate LLM Agents in the Chinese legal domain. LegalAgentBench includes 17 corpora from real-world legal scenarios and provides 37 tools for interacting with external knowledge. To cover tasks of varying difficulty and types, we designed a scalable task construction process that enables a more precise evaluation of performance in both tool utilization and reasoning. Moreover, beyond assessing performance through the success rate of final outcomes, LegalAgentBench incorporates keyword analysis during intermediate processes to calculate progress rates, facilitating a more fine-grained evaluation. We evaluated eight popular LLMs, highlighting the strengths, limitations, and potential areas for improvement of existing models and methods. LegalAgentBench sets a new benchmark for the practical application of LLMs in the legal domain, with its code and data available at https://github.com/CSHaitao/LegalAgentBench.
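The idea of a keyword-based progress rate can be sketched as follows. This is a minimal illustration that assumes the rate is the fraction of expected keywords found anywhere in the agent's intermediate outputs; the abstract does not give the exact formula, and the function name `progress_rate` and the sample data are hypothetical.

```python
def progress_rate(intermediate_outputs, keywords):
    """Return the fraction of expected keywords seen in intermediate steps.

    Assumption: case-insensitive substring matching over the concatenated
    step outputs. The benchmark's real scoring may differ.
    """
    if not keywords:
        return 0.0
    text = " ".join(intermediate_outputs).lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords)


# Hypothetical trace: the agent finished two of three expected sub-steps.
steps = [
    "Queried the company registry and found Acme Ltd",
    "Retrieved case number 123 from the court database",
]
expected = ["Acme Ltd", "case number 123", "Article 12"]
rate = progress_rate(steps, expected)
print(f"{rate:.2f}")
```

A score like this gives partial credit to an agent that retrieves the right intermediate facts but fails the final answer, which a binary success rate would not distinguish from a complete failure.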
2023
Learning In-context Learning for Named Entity Recognition
Jiawei Chen | Yaojie Lu | Hongyu Lin | Jie Lou | Wei Jia | Dai Dai | Hua Wu | Boxi Cao | Xianpei Han | Le Sun
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Named entity recognition in real-world applications suffers from the diversity of entity types, the emergence of new entity types, and the lack of high-quality annotations. To address the above problems, this paper proposes an in-context learning-based NER approach, which can effectively inject in-context NER ability into PLMs and recognize entities of novel types on-the-fly using only a few demonstrative instances. Specifically, we model PLMs as a meta-function $\lambda_{\text{instruction, demonstrations, text}}.\,\mathcal{M}$, and a new entity extractor can be implicitly constructed by applying new instruction and demonstrations to PLMs, i.e., $(\lambda.\,\mathcal{M})(\text{instruction, demonstrations}) \rightarrow \mathcal{F}$, where $\mathcal{F}$ will be a new entity extractor $\mathcal{F}: \text{text} \rightarrow \text{entities}$. To inject the above in-context NER ability into PLMs, we propose a meta-function pre-training algorithm, which pre-trains PLMs by comparing the (instruction, demonstration)-initialized extractor with a surrogate golden extractor. Experimental results on 4 few-shot NER datasets show that our method can effectively inject in-context NER ability into PLMs and significantly outperforms the PLMs+fine-tuning counterparts.
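The meta-function view above amounts to partial application: fixing the instruction and demonstrations yields a function from text to entities. The sketch below shows only that structure. The `toy_model` is a hypothetical stand-in that does lexicon lookup, not a PLM, and all names and sample data are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Entity = Tuple[str, str]  # (mention, entity type)
Demo = Tuple[str, List[Entity]]  # (example text, its annotated entities)


def toy_model(instruction: str, demonstrations: List[Demo], text: str) -> List[Entity]:
    """Stand-in for the PLM M: memorizes demo mentions of requested types."""
    lexicon = {}
    for _, entities in demonstrations:
        for mention, etype in entities:
            if etype in instruction:  # keep only types the instruction names
                lexicon[mention] = etype
    return [(m, t) for m, t in lexicon.items() if m in text]


def build_extractor(
    instruction: str, demonstrations: List[Demo]
) -> Callable[[str], List[Entity]]:
    """Apply (instruction, demonstrations) to the model, returning F: text -> entities."""

    def extractor(text: str) -> List[Entity]:
        return toy_model(instruction, demonstrations, text)

    return extractor


# Hypothetical instruction and demonstration for a novel entity type.
F = build_extractor(
    "Extract entities of type: DISEASE",
    [("He was treated for malaria.", [("malaria", "DISEASE")])],
)
print(F("Cases of malaria rose in 2020."))
```

The extractor `F` is never trained separately; it exists only as the model conditioned on its first two arguments, which is the sense in which it is constructed implicitly.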
2019
ARNOR: Attention Regularization based Noise Reduction for Distant Supervision Relation Classification
Wei Jia | Dai Dai | Xinyan Xiao | Hua Wu
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Distant supervision is widely used in relation classification to create large-scale training data by aligning a knowledge base with an unlabeled corpus. However, it also introduces a large number of noisy labels, where a contextual sentence does not actually express the labeled relation. In this paper, we propose ARNOR, a novel Attention Regularization based NOise Reduction framework for distant supervision relation classification. ARNOR assumes that a trustable relation label should be explained by the neural attention model. Specifically, our ARNOR framework iteratively learns an interpretable model and utilizes it to select trustable instances. We first introduce attention regularization to force the model to pay attention to the patterns which explain the relation labels, so as to make the model more interpretable. Then, if the learned model can clearly locate the relation patterns of a candidate instance in the training set, we select it as a trustable instance for the next training step. According to the experiments on NYT data, our ARNOR framework achieves significant improvements over state-of-the-art methods in both relation classification performance and noise reduction effect.
Co-authors
- Dai Dai 2
- Aiwei Liu 2
- Hua Wu (吴华) 2
- Zhou Xiao 2
- Linhao Zhang 2
- Qingyao Ai 1
- Boxi Cao 1
- Jiawei Chen 1
- Junjie Chen 1
- Miao Fan 1
- Wenhao Gao 1
- Yiran Hu 1
- Xianpei Han 1
- Minlie Huang 1
- Haitao Li 1
- Hongyu Lin 1
- Kai Lin 1
- Yuan Liu 1
- Youfeng Liu 1
- Yiqun Liu 1
- Jie Lou 1
- Yaojie Lu 1
- Yuhan Song 1
- Le Sun 1
- Tianlong Wang 1
- Houfeng Wang 1
- Wuyue Wang 1
- Chuhan Wu 1
- Yueyue Wu 1
- Xinyan Xiao 1
- Jingli Yang 1
- Guozhi Yuan 1
- Sijun Zhang 1