PATS: Sensitivity-aware Noisy Learning for Pretrained Language Models

Yupeng Zhang, Hongzhi Zhang, Sirui Wang, Wei Wu, Zhoujun Li


Abstract
A wide range of NLP tasks benefit from fine-tuning pretrained language models (PLMs). However, directly fine-tuned models contain many redundant parameters that contribute little to the downstream task. We argue that the gap between pretraining and downstream tasks hinders the training of these redundant parameters and results in suboptimal performance of the overall model. In this paper, we present PATS (Perturbation According To Sensitivity), a noisy training mechanism that accounts for each parameter's importance to the downstream task when fine-tuning PLMs. The main idea of PATS is to add larger noise to parameters with lower sensitivity and smaller noise to those with higher sensitivity, so as to activate more parameters' contributions to the downstream task without greatly disturbing the sensitive ones. Extensive experiments on different tasks of the GLUE benchmark show that PATS consistently improves the fine-tuning of PLMs of different sizes, and that the parameters of well-performing models always have more concentrated sensitivity distributions, which experimentally confirms the effectiveness of our method.
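The abstract only describes the idea at a high level; the paper itself defines the exact sensitivity measure and noise schedule. The sketch below is a minimal illustration of the general mechanism, assuming sensitivity is approximated by |θ · ∂L/∂θ| and that Gaussian noise is scaled inversely with normalized sensitivity. Both choices are assumptions for illustration, not the authors' formulation.

```python
import torch

def sensitivity_aware_noise(model, loss, noise_scale=1e-4, eps=1e-12):
    """Sketch of sensitivity-aware noise injection during fine-tuning.

    Assumption: sensitivity of a parameter is approximated by |theta * dL/dtheta|.
    Parameters with lower (normalized) sensitivity receive larger Gaussian noise,
    and highly sensitive parameters are perturbed only slightly.
    """
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            # Elementwise sensitivity proxy (assumed, not the paper's definition).
            sensitivity = (p * p.grad).abs()
            # Normalize to [0, 1] so the most sensitive entries get near-zero noise.
            s = sensitivity / (sensitivity.max() + eps)
            noise = torch.randn_like(p) * noise_scale * (1.0 - s)
            p.add_(noise)
```

In practice such a step would be interleaved with the usual optimizer update; the magnitude `noise_scale` is a hypothetical hyperparameter introduced here for the sketch.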
Anthology ID:
2022.emnlp-main.241
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3680–3687
URL:
https://aclanthology.org/2022.emnlp-main.241
DOI:
10.18653/v1/2022.emnlp-main.241
Cite (ACL):
Yupeng Zhang, Hongzhi Zhang, Sirui Wang, Wei Wu, and Zhoujun Li. 2022. PATS: Sensitivity-aware Noisy Learning for Pretrained Language Models. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 3680–3687, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
PATS: Sensitivity-aware Noisy Learning for Pretrained Language Models (Zhang et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2022.emnlp-main.241.pdf