ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale

Markus Frohmann; Carolin Holtermann; Shahed Masoudian; Anne Lauscher; Navid Rekabsaz

doi:10.18653/v1/2024.findings-acl.699

Abstract

Multi-task learning (MTL) has shown considerable practical benefits, particularly when using language models (LMs). While this is commonly achieved by learning tasks under a joint optimization procedure, some methods, such as AdapterFusion, divide the problem into two stages: (i) task learning, where knowledge specific to a task is encapsulated within sets of parameters (e.g., adapters), and (ii) transfer, where this already learned knowledge is leveraged for a target task. This separation of concerns provides numerous benefits (e.g., promoting reusability). However, current two-stage MTL introduces a substantial number of additional parameters. We address this issue by leveraging the usefulness of linearly scaling the output representations of source adapters for transfer learning. We introduce ScaLearn, a simple and highly parameter-efficient two-stage MTL method that capitalizes on the knowledge of the source tasks by learning a minimal set of scaling parameters that enable effective transfer to a target task. Our experiments on three benchmarks (GLUE, SuperGLUE, and HumSet) and two encoder LMs show that ScaLearn consistently outperforms strong baselines with a small number of transfer parameters (~0.35% of those of AdapterFusion). Remarkably, we observe that ScaLearn maintains its strong abilities even when further reducing parameters, achieving competitive results with only 8 transfer parameters per target task. Our proposed approach thus demonstrates the promise of simple scaling for more efficient task transfer. Our code is available at https://github.com/CPJKU/ScaLearn.
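The core idea in the abstract, linearly scaling the output representations of source adapters and combining them for a target task, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function name `scale_and_combine`, the toy vectors, and the choice of one scalar per source adapter are assumptions made purely for illustration.

```python
from typing import List


def scale_and_combine(adapter_outputs: List[List[float]],
                      scales: List[float]) -> List[float]:
    """Return sum_t scales[t] * adapter_outputs[t] (hypothetical helper).

    adapter_outputs: one output vector per (frozen) source adapter.
    scales: one scaling coefficient per source adapter; in a real setup
            these would be the only parameters trained during transfer.
    """
    if len(adapter_outputs) != len(scales):
        raise ValueError("need exactly one scale per source adapter")
    dim = len(adapter_outputs[0])
    combined = [0.0] * dim
    for output, scale in zip(adapter_outputs, scales):
        if len(output) != dim:
            raise ValueError("all adapter outputs must share one dimension")
        for i, value in enumerate(output):
            combined[i] += scale * value
    return combined


# Toy example: three source adapters, each producing a 2-dimensional output.
source_outputs = [[1.0, 2.0], [0.5, -1.0], [0.0, 4.0]]
scales = [0.5, 1.0, 0.25]  # assumed values; these would be learned

combined = scale_and_combine(source_outputs, scales)
num_transfer_params = len(scales)  # one scalar per source adapter here
```

Under this assumed parameterization, the number of transfer parameters grows with the number of source adapters rather than with the hidden dimension, which is one way to see why a scaling-based transfer layer can be very small.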

Anthology ID: 2024.findings-acl.699
Volume: Findings of the Association for Computational Linguistics: ACL 2024
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 11743–11776
URL: https://preview.aclanthology.org/Author-page-Marten-During-lu/2024.findings-acl.699/
DOI: 10.18653/v1/2024.findings-acl.699
Cite (ACL): Markus Frohmann, Carolin Holtermann, Shahed Masoudian, Anne Lauscher, and Navid Rekabsaz. 2024. ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale. In Findings of the Association for Computational Linguistics: ACL 2024, pages 11743–11776, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale (Frohmann et al., Findings 2024)
PDF: https://preview.aclanthology.org/Author-page-Marten-During-lu/2024.findings-acl.699.pdf
