SP3: Enhancing Structured Pruning via PCA Projection

Yuxuan Hu, Jing Zhang, Zhe Zhao, Chen Zhao, Xiaodong Chen, Cuiping Li, Hong Chen


Abstract
Structured pruning is a widely used technique for reducing the size of pre-trained language models (PLMs), but current methods often overlook the potential of compressing the hidden dimension d in PLMs, a dimension critical to model size and efficiency. This paper introduces a novel structured pruning approach, Structured Pruning with PCA Projection (SP3), targeting the effective reduction of d by projecting features into a space defined by principal components before masking. Extensive experiments on benchmarks (GLUE and SQuAD) show that SP3 can reduce d by 70%, compress 94% of the BERT-base model, maintain over 96% of the accuracy, and outperform other methods that also compress d by 6% in accuracy at the same compression ratio. SP3 has also proven effective with other models, including OPT and Llama. Our data and code are available at https://github.com/hyx1999/SP3.
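The abstract's key step, projecting features onto principal components before masking the hidden dimension, can be sketched in a few lines. Below is a minimal PyTorch illustration under our own assumptions: the helper name pca_project_and_prune, the calibration-feature input, and the keep_ratio parameter are hypothetical and not taken from the authors' released code.

```python
import torch

def pca_project_and_prune(hidden_states, weight, keep_ratio=0.3):
    """Minimal sketch: rotate features into their principal-component basis,
    then keep only the top-k components of the hidden dimension d.

    hidden_states: (num_samples, d) calibration features
    weight:        (d, d_out) a linear layer consuming those features
    """
    # Center the calibration features and form their covariance matrix.
    centered = hidden_states - hidden_states.mean(dim=0, keepdim=True)
    cov = centered.T @ centered / (centered.shape[0] - 1)

    # Eigendecomposition of the symmetric covariance; sort principal
    # directions from largest eigenvalue (most variance) down.
    eigvals, eigvecs = torch.linalg.eigh(cov)
    order = torch.argsort(eigvals, descending=True)
    V = eigvecs[:, order]

    # Keeping only the top-k columns plays the role of the mask over d.
    k = max(1, int(keep_ratio * weight.shape[0]))
    V_k = V[:, :k]                      # (d, k)

    # Fold the rotation into the layer: x @ W ≈ (x @ V_k) @ (V_k.T @ W),
    # so the pruned layer runs in a k-dimensional hidden space.
    return V_k, V_k.T @ weight          # projection, pruned weight

# Hypothetical usage: 768-dim features, 70% of d pruned (keep_ratio=0.3).
feats = torch.randn(1024, 768)          # calibration hidden states
W = torch.randn(768, 3072)              # e.g. an FFN up-projection
V_k, W_pruned = pca_project_and_prune(feats, W, keep_ratio=0.3)
out = feats @ V_k @ W_pruned            # low-rank approximation of feats @ W
```

Because the principal directions are orthonormal, masking in the rotated basis discards only the lowest-variance components of the features, which is why projecting before masking can lose less information than pruning raw hidden coordinates; the mean term is omitted in this sketch for brevity.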
Anthology ID:
2024.findings-acl.187
Volume:
Findings of the Association for Computational Linguistics ACL 2024
Month:
August
Year:
2024
Address:
Bangkok, Thailand and virtual meeting
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3150–3170
URL:
https://aclanthology.org/2024.findings-acl.187
DOI:
10.18653/v1/2024.findings-acl.187
Bibkey:
Cite (ACL):
Yuxuan Hu, Jing Zhang, Zhe Zhao, Chen Zhao, Xiaodong Chen, Cuiping Li, and Hong Chen. 2024. SP3: Enhancing Structured Pruning via PCA Projection. In Findings of the Association for Computational Linguistics ACL 2024, pages 3150–3170, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
Cite (Informal):
SP3: Enhancing Structured Pruning via PCA Projection (Hu et al., Findings 2024)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2024.findings-acl.187.pdf