Evaluating Parameter Efficient Learning for Generation

Peng Xu; Mostofa Patwary; Shrimai Prabhumoye; Virginia Adams; Ryan Prenger; Wei Ping; Nayeon Lee; Mohammad Shoeybi; Bryan Catanzaro

doi:10.18653/v1/2022.emnlp-main.319

Abstract

Parameter efficient learning methods (PERMs) have recently gained significant attention as they provide an efficient way for pre-trained language models (PLMs) to adapt to a downstream task. However, these conclusions are mostly drawn from in-domain evaluations over the full training set. In this paper, we present comparisons between PERMs and finetuning from three new perspectives: (1) the effect of sample and model size on in-domain evaluations, (2) generalization to unseen domains and new datasets, and (3) the faithfulness of generations. Our results show that for in-domain settings (a) there is a cross point of sample size for which PERMs will perform better than finetuning when training with fewer samples, and (b) larger PLMs have larger cross points. For cross-domain and cross-dataset cases, we show that (a) Adapter (Houlsby et al., 2019) performs the best amongst all the PERMs studied here, and (b) it outperforms finetuning if the task dataset is below a certain size. We also compare the faithfulness of generations and show that PERMs can achieve a better faithfulness score than finetuning, especially for small training sets, by as much as 6%. Finally, we apply Adapter to MT-NLG 530b (Smith et al., 2022) and achieve new state-of-the-art results on Xsum (Narayan et al., 2018) for all ROUGE scores (ROUGE-1 49.17, ROUGE-2 27.20, ROUGE-L 40.98).

Anthology ID: 2022.emnlp-main.319
Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates
Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 4824–4833
URL: https://aclanthology.org/2022.emnlp-main.319
DOI: 10.18653/v1/2022.emnlp-main.319
Cite (ACL): Peng Xu, Mostofa Patwary, Shrimai Prabhumoye, Virginia Adams, Ryan Prenger, Wei Ping, Nayeon Lee, Mohammad Shoeybi, and Bryan Catanzaro. 2022. Evaluating Parameter Efficient Learning for Generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 4824–4833, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal): Evaluating Parameter Efficient Learning for Generation (Xu et al., EMNLP 2022)
PDF: https://preview.aclanthology.org/nschneid-patch-2/2022.emnlp-main.319.pdf
