Pruning Adatperfusion with Lottery Ticket Hypothesis
Jiarun Wu, Qingliang Chen, Zeguan Xiao, Yuliang Gu, Mengsi Sun
Abstract
Pre-trained language models have shown great success in multiple downstream tasks. However, they are computationally expensive to fine-tune. Transfer learning with adapter modules has therefore been introduced to alleviate this problem, helping to extract knowledge of the downstream tasks. Adapterfusion models are an example of transformers with adapter modules: they merge multiple adapters to incorporate knowledge from different tasks. However, merging multiple adapters inevitably introduces redundancy, substantially increasing training and inference time. In this paper, we therefore propose an approach to identify the influence of each adapter module and a novel way to prune adapters based on the well-known Lottery Ticket Hypothesis. Experiments on GLUE datasets show that the Adapterfusion model pruned with our scheme can achieve state-of-the-art results, significantly reducing model size while keeping performance intact.
- Anthology ID:
- 2022.findings-naacl.123
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2022
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1632–1646
- URL:
- https://aclanthology.org/2022.findings-naacl.123
- DOI:
- 10.18653/v1/2022.findings-naacl.123
- Cite (ACL):
- Jiarun Wu, Qingliang Chen, Zeguan Xiao, Yuliang Gu, and Mengsi Sun. 2022. Pruning Adatperfusion with Lottery Ticket Hypothesis. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 1632–1646, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- Pruning Adatperfusion with Lottery Ticket Hypothesis (Wu et al., Findings 2022)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2022.findings-naacl.123.pdf
- Data:
- GLUE, MultiNLI, QNLI