2025
P2 Law: Scaling Law for Post-Training After Model Pruning
Xiaodong Chen | Yuxuan Hu | Xiaokang Zhang | Yanling Wang | Cuiping Li | Hong Chen | Jing Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Pruning has become a widely adopted technique for reducing the hardware requirements of large language models (LLMs). To recover model performance after pruning, post-training is commonly employed to mitigate the resulting performance degradation. While post-training benefits from larger datasets, once the dataset size is already substantial, increasing the training data provides only limited performance gains. To balance post-training cost and model performance, it is necessary to explore the optimal amount of post-training data. Through extensive experiments on the Llama-3 and Qwen-2.5 series models, pruned using various common pruning methods, we uncover the scaling Law for Post-training after model Pruning, referred to as the P2 Law. This law identifies four key factors for predicting the pruned model’s post-training loss: the model size before pruning, the number of post-training tokens, the pruning rate, and the model’s loss before pruning. Moreover, the P2 Law generalizes to larger dataset sizes, larger model sizes, and higher pruning rates, offering valuable insights for the post-training of pruned LLMs.
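For illustration only, a law over these four factors might take a Chinchilla-style parametric form like the sketch below; the functional shape and every symbol and exponent here are placeholder assumptions, not the paper's fitted law:

    \mathcal{L}(N, D, \rho) \;\approx\; \mathcal{L}_0 \;+\; a\,\rho^{\alpha} \;+\; \frac{b}{N^{\beta}} \;+\; \frac{c}{D^{\gamma}}

where N is the model size before pruning, D the number of post-training tokens, \rho the pruning rate, \mathcal{L}_0 the loss before pruning, and a, b, c, \alpha, \beta, \gamma constants fitted per model family and pruning method.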
SAM Decoding: Speculative Decoding via Suffix Automaton
Yuxuan Hu | Ke Wang | Xiaokang Zhang | Fanjin Zhang | Cuiping Li | Hong Chen | Jing Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Speculative decoding (SD) has been demonstrated as an effective technique for lossless LLM inference acceleration. Retrieval-based SD methods, a kind of model-free method, have yielded promising speedups, but they often rely on a single retrieval resource and inefficient retrieval methods, and are constrained to certain tasks. This paper presents a novel retrieval-based speculative decoding method that adapts the suffix automaton (SAM) for efficient and accurate draft generation, utilizing both the text generated so far and a static text corpus. Unlike existing n-gram matching methods, SAM-Decoding finds the exact longest suffix match, achieving an average time complexity of O(1) per generation step for SAM updates and suffix retrieval. It can also be integrated with existing methods, adaptively selecting a draft generation strategy based on match length to generalize to broader domains. Extensive experiments on Spec-Bench show that our method is 18% faster than other retrieval-based SD methods. Additionally, when combined with the advanced EAGLE-2, it provides an additional speedup of 3.28%–11.13% across various-sized LLM backbones.
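The mechanism lends itself to a compact sketch. Below is a minimal, self-contained Python illustration (our assumptions, not the authors' implementation) of an online suffix automaton over token IDs: extend builds the automaton incrementally, step maintains the exact longest-suffix-match state in amortized O(1) per token, and draft proposes the tokens that followed the matched position as a speculative draft.

class SuffixAutomaton:
    def __init__(self):
        self.next = [{}]     # per-state transitions: token -> state
        self.link = [-1]     # suffix links
        self.length = [0]    # longest match length represented by each state
        self.endpos = [-1]   # one position where each state's match ends
        self.last = 0        # state representing the full text seen so far
        self.text = []       # token history (the retrieval source)

    def _new_state(self, length, endpos):
        self.next.append({}); self.link.append(-1)
        self.length.append(length); self.endpos.append(endpos)
        return len(self.next) - 1

    def extend(self, token):
        """Standard online SAM construction, one token at a time."""
        cur = self._new_state(self.length[self.last] + 1, len(self.text))
        self.text.append(token)
        p = self.last
        while p != -1 and token not in self.next[p]:
            self.next[p][token] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][token]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:  # split: clone q so stored lengths stay consistent
                clone = self._new_state(self.length[p] + 1, self.endpos[q])
                self.next[clone] = dict(self.next[q])
                self.link[clone] = self.link[q]
                while p != -1 and self.next[p].get(token) == q:
                    self.next[p][token] = clone
                    p = self.link[p]
                self.link[q] = self.link[cur] = clone
        self.last = cur

    def step(self, state, match_len, token):
        """Advance the longest-suffix-match state after `token` is accepted."""
        while state != 0 and token not in self.next[state]:
            state = self.link[state]   # fall back to a shorter suffix
            match_len = self.length[state]
        if token in self.next[state]:
            return self.next[state][token], match_len + 1
        return 0, 0                    # no suffix matches; restart

    def draft(self, state, k):
        """Propose the k tokens that followed the matched suffix.
        Only meaningful when the current match length is > 0."""
        start = self.endpos[state] + 1
        return self.text[start:start + k]

Usage under these assumptions: build the automaton over a static corpus with extend, then at each decoding step call step with the newly accepted token and draft for candidate tokens to verify; when the automaton also indexes the generated text itself, call step before extend so the match points at an earlier occurrence rather than the trivial one at the end.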
SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training
Wenxi Chen | Ziyang Ma | Ruiqi Yan | Yuzhe Liang | Xiquan Li | Ruiyang Xu | Zhikang Niu | Yanqiao Zhu | Yifan Yang | Zhanxun Liu | Kai Yu | Yuxuan Hu | Jinyu Li | Yan Lu | Shujie Liu | Xie Chen
Findings of the Association for Computational Linguistics: ACL 2025
Recent advancements highlight the potential of end-to-end real-time spoken dialogue systems, showcasing their low latency and high quality. In this paper, we introduce SLAM-Omni, a timbre-controllable, end-to-end voice interaction system with single-stage training. SLAM-Omni achieves zero-shot timbre control by modeling spoken language with semantic tokens and decoupling speaker information to a vocoder. By predicting grouped speech semantic tokens at each step, our method significantly reduces the sequence length of audio tokens, accelerating both training and inference. Additionally, we propose historical text prompting to compress dialogue history, facilitating efficient multi-round interactions. Comprehensive evaluations reveal that SLAM-Omni outperforms prior models of similar scale, requiring only 15 hours of training on 4 GPUs with limited data. Notably, it is the first spoken dialogue system to achieve competitive performance with a single-stage training approach, eliminating the need for pre-training on TTS or ASR tasks. Further experiments validate its multilingual and multi-turn dialogue capabilities on larger datasets.
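As a toy illustration of the grouping idea only (the group size, padding token, and shapes are our assumptions, not SLAM-Omni's configuration), predicting fixed-size groups of semantic tokens per autoregressive step shortens the modeled sequence by the group size:

# Toy sketch: reshape a speech semantic-token stream into fixed-size groups so
# the autoregressive model emits one group per step (g tokens at once).
# PAD_ID and the group size g=4 are illustrative assumptions.
PAD_ID = 0

def group_tokens(tokens, g=4):
    pad = (-len(tokens)) % g               # right-pad to a multiple of g
    padded = tokens + [PAD_ID] * pad
    return [padded[i:i + g] for i in range(0, len(padded), g)]

# A 12-token stream becomes 3 decoding steps instead of 12.
assert len(group_tokens(list(range(1, 13)), g=4)) == 3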
2024
SP3: Enhancing Structured Pruning via PCA Projection
Yuxuan Hu | Jing Zhang | Zhe Zhao | Chen Zhao | Xiaodong Chen | Cuiping Li | Hong Chen
Findings of the Association for Computational Linguistics: ACL 2024
2023
A Generation-based Deductive Method for Math Word Problems
Yuxuan Hu | Jing Zhang | Haoyang Li | Cuiping Li | Hong Chen
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Math word problems (MWPs) involving advanced operators such as a linear equation solver cannot be easily tackled by earlier MWP methods, because the existing generation methods suffer from repeated sub-expression generation and the deductive methods are restricted to binary operations. This paper proposes a new multivariate directed acyclic graph (mDAG) as an alternative to the generation methods’ binary expression tree and the deductive methods’ binary directed acyclic graph. To produce the topological ordering of the mDAG, we propose a generation-based deductive (GeDe) model, which equips a generation model with a re-encoder to retain the deductive property while avoiding the expensive enumeration of the deductive methods. GeDe performs well on math problems with many operators on widely used benchmarks, as well as on multivariate operators on our own CMWPA benchmark. Our code is available at https://github.com/hyx1999/GeDe
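As a hypothetical sketch of the data-structure contrast drawn above (names and fields are illustrative, not the paper's code), an mDAG node may take any number of operands and yield several outputs, which a binary expression tree cannot represent without regenerating sub-expressions:

from dataclasses import dataclass, field

@dataclass
class MDagNode:
    op: str                                       # e.g. "+", "solve_linear_system"
    operands: list                                # quantities or earlier nodes
    outputs: list = field(default_factory=list)   # one or more result symbols

# x + y = 10, x - y = 2  ->  a single multivariate node with two outputs,
# which later nodes can reference instead of re-generating the solver call.
solver = MDagNode("solve_linear_system",
                  operands=["x + y = 10", "x - y = 2"],
                  outputs=["x", "y"])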
2008
A Cascaded Syntactic and Semantic Dependency Parsing System
Wanxiang Che | Zhenghua Li | Yuxuan Hu | Yongqiang Li | Bing Qin | Ting Liu | Sheng Li
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning
2007
HIT-IR-WSD: A WSD System for English Lexical Sample Task
Yuhang Guo | Wanxiang Che | Yuxuan Hu | Wei Zhang | Ting Liu
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)
2005
Semantic Role Labeling System Using Maximum Entropy Classifier
Ting Liu | Wanxiang Che | Sheng Li | Yuxuan Hu | Huaijun Liu
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)