Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency

Yanyang Li, Fuli Luo, Runxin Xu, Songfang Huang, Fei Huang, Liwei Wang


Abstract
Structured pruning has been extensively studied on monolingual pre-trained language models but is yet to be fully evaluated on their multilingual counterparts. This work investigates three aspects of structured pruning on multilingual pre-trained language models: settings, algorithms, and efficiency. Experiments on nine downstream tasks show several counter-intuitive phenomena: for settings, pruning individually for each language does not yield better results; for algorithms, the simplest method performs the best; for efficiency, a fast model does not imply that it is also small. To facilitate comparison across all sparsity levels, we present Dynamic Sparsification, a simple approach that allows training the model once and adapting it to different model sizes at inference. We hope this work fills the gap in the study of structured pruning on multilingual pre-trained models and sheds light on future research.
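The abstract only names Dynamic Sparsification without detailing it. As a rough illustration of the general "train once, adapt to different sizes at inference" idea (not the paper's actual method), the sketch below ranks the hidden units of a feed-forward block by a learned importance score and keeps only the top fraction at inference time. All identifiers here (DynamicFFN, keep_ratio, the importance parameter) are assumptions made for illustration.

```python
# Minimal sketch, assuming a PyTorch feed-forward block whose hidden units
# can be sliced to any width at inference. This is an illustration of the
# "train once, evaluate at many sparsity levels" idea, not the paper's code.
import torch
import torch.nn as nn

class DynamicFFN(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)
        # One importance score per hidden unit; randomly initialized here,
        # but it would be learned jointly with the weights during training.
        self.importance = nn.Parameter(torch.randn(d_hidden))

    def forward(self, x: torch.Tensor, keep_ratio: float = 1.0) -> torch.Tensor:
        d_hidden = self.importance.numel()
        k = max(1, int(d_hidden * keep_ratio))
        # Keep only the k most important hidden units; the same trained
        # weights serve every sparsity level at inference.
        idx = torch.topk(self.importance, k).indices
        h = torch.relu(nn.functional.linear(
            x, self.up.weight[idx], self.up.bias[idx]))
        return nn.functional.linear(
            h, self.down.weight[:, idx], self.down.bias)

# Usage: one model, evaluated at several sparsity levels without retraining.
ffn = DynamicFFN(d_model=768, d_hidden=3072)
x = torch.randn(2, 16, 768)
for ratio in (1.0, 0.5, 0.25):
    print(ratio, ffn(x, keep_ratio=ratio).shape)
```

Lowering keep_ratio simply keeps fewer of the top-ranked hidden units, so a single trained model can be compared at any sparsity level at inference time.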
Anthology ID:
2022.acl-long.130
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1852–1865
URL:
https://aclanthology.org/2022.acl-long.130
DOI:
10.18653/v1/2022.acl-long.130
Cite (ACL):
Yanyang Li, Fuli Luo, Runxin Xu, Songfang Huang, Fei Huang, and Liwei Wang. 2022. Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1852–1865, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Probing Structured Pruning on Multilingual Pre-trained Models: Settings, Algorithms, and Efficiency (Li et al., ACL 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.acl-long.130.pdf
Software:
 2022.acl-long.130.software.zip
Data
BUCC, CC100, MLQA, PAWS-X, TyDi QA, XNLI, XQuAD