Layer-wise Model Pruning based on Mutual Information
Chun Fan, Jiwei Li, Tianwei Zhang, Xiang Ao, Fei Wu, Yuxian Meng, Xiaofei Sun
Abstract
Inspired by mutual information (MI) based feature selection in SVMs and logistic regression, in this paper, we propose MI-based layer-wise pruning: for each layer of a multi-layer neural network, neurons with higher values of MI with respect to preserved neurons in the upper layer are preserved. Starting from the top softmax layer, layer-wise pruning proceeds in a top-down fashion until reaching the bottom word embedding layer. The proposed pruning strategy offers merits over weight-based pruning techniques: (1) it avoids irregular memory access since representations and matrices can be squeezed into their smaller but dense counterparts, leading to greater speedup; (2) in a manner of top-down pruning, the proposed method operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of global signals through layers, leading to better performances at the same sparsity level. Extensive experiments show that at the same sparsity level, the proposed strategy offers both greater speedup and higher performances than weight-based pruning methods (e.g., magnitude pruning, movement pruning).- Anthology ID:
- 2021.emnlp-main.246
- Volume:
- Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2021
- Address:
- Online and Punta Cana, Dominican Republic
- Editors:
- Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 3079–3090
- Language:
- URL:
- https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2021.emnlp-main.246/
- DOI:
- 10.18653/v1/2021.emnlp-main.246
- Cite (ACL):
- Chun Fan, Jiwei Li, Tianwei Zhang, Xiang Ao, Fei Wu, Yuxian Meng, and Xiaofei Sun. 2021. Layer-wise Model Pruning based on Mutual Information. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 3079–3090, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Layer-wise Model Pruning based on Mutual Information (Fan et al., EMNLP 2021)
- PDF:
- https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2021.emnlp-main.246.pdf