PIP: Perturbation-based Iterative Pruning for Large Language Models
Yi Cao | Wei-Jie Xu | Yucheng Shen | Weijie Shi | Chi-Min Chan | Jianfeng Qu | Jiajie Xu
Findings of the Association for Computational Linguistics: EMNLP 2025
The rapid growth in the parameter counts of Large Language Models (LLMs), which often reach into the billions or even trillions, presents significant challenges for practical deployment, particularly in resource-constrained environments. To address this issue, we propose PIP (Perturbation-based Iterative Pruning), a novel double-view structured pruning method for optimizing LLMs that combines information from two different views: the unperturbed view and the perturbed view. By computing gradient differences between the two views, PIP iteratively prunes the structures that struggle to distinguish between them. Our experiments show that PIP reduces the parameter count by approximately 20% while retaining over 85% of the original model's accuracy across varied benchmarks. In some cases, the performance of the pruned model is within 5% of the unpruned version, demonstrating PIP's ability to preserve key aspects of model effectiveness. Moreover, PIP consistently outperforms existing state-of-the-art (SOTA) structured pruning methods, establishing it as a leading technique for optimizing LLMs in constrained environments.
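To make the double-view idea concrete, here is a minimal PyTorch sketch of perturbation-based importance scoring via gradient differences. It is an illustration of the general idea only, not the paper's implementation: the perturbation here is simple Gaussian input noise, the scores are per parameter tensor rather than per structured group, and all names (gradient_difference_scores, noise_std) are hypothetical. PIP itself applies this kind of scoring iteratively to structured components of an LLM.

```python
import torch
import torch.nn as nn

def gradient_difference_scores(model: nn.Module,
                               inputs: torch.Tensor,
                               targets: torch.Tensor,
                               noise_std: float = 1e-3) -> dict:
    """Score each parameter tensor by how much its gradient changes
    between an unperturbed and a perturbed forward pass. Near-zero
    scores mark structures that cannot tell the two views apart,
    making them candidates for pruning."""
    loss_fn = nn.CrossEntropyLoss()

    # View 1: gradients on the clean (unperturbed) inputs.
    model.zero_grad()
    loss_fn(model(inputs), targets).backward()
    clean = {name: p.grad.detach().clone()
             for name, p in model.named_parameters() if p.grad is not None}

    # View 2: gradients after a small Gaussian input perturbation
    # (an assumed perturbation scheme, for illustration only).
    model.zero_grad()
    perturbed = inputs + noise_std * torch.randn_like(inputs)
    loss_fn(model(perturbed), targets).backward()

    # Importance = magnitude of the gradient difference across views.
    return {name: (p.grad - clean[name]).abs().sum().item()
            for name, p in model.named_parameters()
            if p.grad is not None and name in clean}

# Toy usage: score a small classifier and report the weakest group,
# which an iterative pruner would remove first.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
scores = gradient_difference_scores(model, x, y)
print("lowest-scoring parameter group:", min(scores, key=scores.get))
```

An iterative scheme in the spirit of the abstract would alternate this scoring with the removal of the lowest-scoring structures, re-evaluating after each round so that importance estimates reflect the already-pruned model.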