Jing Xiao
2026
Astra: Activation-Space Tail-Eigenvector Low-Rank Adaptation of Large Language Models
Kainan Liu | Yong Zhang | Ning Cheng | Yun Zhu | Yanmeng Wang | Shaojun Wang | Jing Xiao
Findings of the Association for Computational Linguistics: ACL 2026
Parameter-Efficient Fine-Tuning (PEFT) methods, especially LoRA, are widely used for adapting pre-trained models to downstream tasks due to their computational and storage efficiency. However, in the context of LoRA and its variants, the potential of activation subspaces corresponding to tail eigenvectors remains substantially under-exploited, which may lead to suboptimal fine-tuning performance. In this work, we propose Astra (Activation-Space Tail-Eigenvector Low-Rank Adaptation), a novel PEFT method that leverages the tail eigenvectors of the model output activations—estimated from a small task-specific calibration set—to construct task-adaptive low-rank adapters. By constraining updates to the subspace spanned by these tail eigenvectors, Astra achieves faster convergence and improved downstream performance with a significantly reduced parameter budget. Extensive experiments across natural language understanding (NLU) and natural language generation (NLG) tasks demonstrate that Astra consistently outperforms existing PEFT baselines across 16 benchmarks and even surpasses full fine-tuning (FFT) in certain scenarios.
2025
Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
Yanwen Huang | Yong Zhang | Ning Cheng | Zhitao Li | Shaojun Wang | Jing Xiao
Findings of the Association for Computational Linguistics: ACL 2025
Large language models (LLMs) often exhibit Context Faithfulness Hallucinations, where outputs deviate from retrieved information due to incomplete context integration. Our analysis reveals a strong correlation between token-level uncertainty and hallucinations. We hypothesize that attention mechanisms inherently encode context utilization signals, supported by probing analysis. Based on these insights, we propose Dynamic Attention-Guided Context Decoding (DAGCD), a lightweight framework that leverages attention distributions and uncertainty signals in a single-pass decoding. Experiments on open-book QA datasets demonstrate DAGCD’s effectiveness, yielding significant improvements in faithfulness and robustness while preserving computational efficiency.
GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression
Kainan Liu | Yong Zhang | Ning Cheng | Zhitao Li | Shaojun Wang | Jing Xiao
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Recent studies have demonstrated that many layers are functionally redundant in large language models (LLMs), enabling model compression by removing these layers to reduce inference cost. While such approaches can improve efficiency, indiscriminate layer pruning often results in significant performance degradation. In this paper, we propose **GRASP** (**G**radient-based **R**etention of **A**daptive **S**ingular **P**arameters), a novel compression framework that mitigates this issue by preserving sensitivity-aware singular values. Unlike direct layer pruning, GRASP leverages gradient-based attribution on a small calibration dataset to adaptively identify and retain critical singular components. By replacing redundant layers with only a minimal set of parameters, GRASP achieves efficient compression while maintaining strong performance with minimal overhead. Experiments across multiple LLMs show that GRASP consistently outperforms existing compression methods, achieving 90% of the original model’s performance under a 20% compression ratio.
ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents
Zhigen Li | Jianxiang Peng | Yanmeng Wang | Yong Cao | Tianhao Shen | Minghui Zhang | Linxi Su | Shang Wu | Yihang Wu | YuQian Wang | Ye Wang | Wei Hu | Jianfeng Li | Shaojun Wang | Jing Xiao | Deyi Xiong
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Dialogue agents powered by Large Language Models (LLMs) show superior performance in various tasks. Despite the better user understanding and human-like responses, their **lack of controllability** remains a key challenge, often leading to unfocused conversations or task failure. To address this, we introduce Standard Operating Procedure (SOP) to regulate dialogue flow. Specifically, we propose **ChatSOP**, a novel SOP-guided Monte Carlo Tree Search (MCTS) planning framework designed to enhance the controllability of LLM-driven dialogue agents. To enable this, we curate a dataset comprising SOP-annotated multi-scenario dialogues, generated using a semi-automated role-playing system with GPT-4o and validated through strict manual quality control. Additionally, we propose a novel method that integrates Chain of Thought reasoning with supervised fine-tuning for SOP prediction and utilizes SOP-guided Monte Carlo Tree Search for optimal action planning during dialogues. Experimental results demonstrate the effectiveness of our method, such as achieving a 27.95% improvement in action accuracy compared to baseline models based on GPT-3.5 and also showing notable gains for open-source models. Dataset and codes are publicly available.
2024
IDEAW: Robust Neural Audio Watermarking with Invertible Dual-Embedding
Pengcheng Li | Xulong Zhang | Jing Xiao | Jianzong Wang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The audio watermarking technique embeds messages into audio and accurately extracts messages from the watermarked audio. Traditional methods develop algorithms based on expert experience to embed watermarks into the time-domain or transform-domain of signals. With the development of deep neural networks, deep learning-based neural audio watermarking has emerged. Compared to traditional algorithms, neural audio watermarking achieves better robustness by considering various attacks during training. However, current neural watermarking methods suffer from low capacity and unsatisfactory imperceptibility. Additionally, the issue of watermark locating, which is extremely important and even more pronounced in neural audio watermarking, has not been adequately studied. In this paper, we design a dual-embedding watermarking model for efficient locating. We also consider the impact of the attack layer on the invertible neural network in robustness training, improving the model to enhance both its reasonableness and stability. Experiments show that the proposed model, IDEAW, can withstand various attacks with higher capacity and more efficient locating ability compared to existing methods.