Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators
Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Zhi-Yuan Xie, Zhong-Yi Lu, Ji-Rong Wen
Abstract
This paper presents a novel pre-trained language model (PLM) compression approach based on the matrix product operator (MPO for short) from quantum many-body physics. It decomposes an original matrix into central tensors (containing the core information) and auxiliary tensors (with only a small proportion of parameters). Based on the decomposed MPO structure, we propose a novel fine-tuning strategy that updates only the parameters of the auxiliary tensors, and design an optimization algorithm for MPO-based approximation over stacked network architectures. Our approach can be applied in a general way to either original or already-compressed PLMs, yielding a lighter network and significantly reducing the number of parameters to be fine-tuned. Extensive experiments demonstrate the effectiveness of the proposed approach in model compression, especially the reduction in fine-tuning parameters (91% reduction on average). The code to reproduce the results of this paper can be found at https://github.com/RUCAIBox/MPOP.
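To make the decomposition described in the abstract concrete, below is a minimal NumPy sketch of an MPO (tensor-train) factorization of a single weight matrix via sequential truncated SVDs. The function name mpo_decompose, the (4, 8, 8, 4) factorization, and the rank cap are illustrative assumptions for this sketch, not the released MPOP implementation.

```python
import numpy as np

def mpo_decompose(W, in_dims, out_dims, rank=None):
    """Decompose a 2-D weight matrix into an MPO (tensor-train) chain.

    W        : array of shape (prod(in_dims), prod(out_dims))
    in_dims  : factorization of the input dimension
    out_dims : factorization of the output dimension (same length as in_dims)
    rank     : optional cap on the internal bond dimension (truncated SVD)
    Returns a list of 4-D cores with shapes (d_{k-1}, i_k, j_k, d_k).
    """
    n = len(in_dims)
    # Reshape to (i1, ..., in, j1, ..., jn) and interleave the factors so
    # that each core carries one (i_k, j_k) pair.
    tensor = W.reshape(*in_dims, *out_dims)
    perm = [p for k in range(n) for p in (k, n + k)]
    tensor = np.transpose(tensor, perm)

    cores, bond = [], 1
    rest = tensor.reshape(bond * in_dims[0] * out_dims[0], -1)
    for k in range(n - 1):
        # Split off one core with a (possibly truncated) SVD.
        U, S, Vt = np.linalg.svd(rest, full_matrices=False)
        r = len(S) if rank is None else min(rank, len(S))
        cores.append(U[:, :r].reshape(bond, in_dims[k], out_dims[k], r))
        bond = r
        rest = (np.diag(S[:r]) @ Vt[:r]).reshape(
            bond * in_dims[k + 1] * out_dims[k + 1], -1)
    cores.append(rest.reshape(bond, in_dims[-1], out_dims[-1], 1))
    return cores

# Illustrative usage: factorize a 1024 x 1024 matrix into a 4-core chain.
W = np.random.randn(1024, 1024)
cores = mpo_decompose(W, (4, 8, 8, 4), (4, 8, 8, 4), rank=64)
print([c.shape for c in cores])
```

In this sketch the large middle cores correspond to the central tensor, which would be kept frozen, while the small cores at the two ends play the role of the auxiliary tensors that the paper fine-tunes; the printed shapes only illustrate that split, not the paper's exact configuration.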
- Anthology ID:
- 2021.acl-long.418
- Volume:
- Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
- Venues:
- ACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5388–5398
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2021.acl-long.418/
- DOI:
- 10.18653/v1/2021.acl-long.418
- Cite (ACL):
- Peiyu Liu, Ze-Feng Gao, Wayne Xin Zhao, Zhi-Yuan Xie, Zhong-Yi Lu, and Ji-Rong Wen. 2021. Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 5388–5398, Online. Association for Computational Linguistics.
- Cite (Informal):
- Enabling Lightweight Fine-tuning for Pre-trained Language Model Compression based on Matrix Product Operators (Liu et al., ACL-IJCNLP 2021)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2021.acl-long.418.pdf
- Code:
- RUCAIBox/MPOP
- Data:
- GLUE, SST, SST-2