PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models

HyunJin Kim, Young Jin Kim, JinYeong Bak


Abstract
Pre-trained language models (PLMs) show impressive performance in various downstream NLP tasks. However, pre-training large language models demands substantial memory and training compute. Furthermore, due to the substantial resources required, many PLM weights are confidential. Consequently, users are compelled to share their data with model owners in order to fine-tune on specific tasks. To overcome these limitations, we introduce Plug-in External Memory Adaptation (PEMA), a Parameter-Efficient Fine-Tuning (PEFT) method that enables PLM fine-tuning without requiring access to all the weights. PEMA integrates with context representations from test data during inference to perform downstream tasks. It uses external memory to store PLM-generated context representations mapped with target tokens. Our method utilizes the weight matrices of a LoRA-like bottlenecked adapter in the PLM’s final layer to enhance efficiency. Our approach also includes Gradual Unrolling, a novel interpolation strategy to improve generation quality. We validate PEMA’s effectiveness through experiments on syntactic and real datasets for machine translation and style transfer. Our findings show that PEMA outperforms other PEFT approaches in memory and latency efficiency for training, and also excels in maintaining sentence meaning and generating appropriate language and styles.
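The two ideas named in the abstract, an external memory of context representations mapped to target tokens and an interpolation between output distributions, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`build_external_memory`, `interpolate`, `decaying_weight`) are hypothetical, and the linearly decaying mixing weight is an assumption made for illustration only; the abstract does not specify the Gradual Unrolling schedule.

```python
from typing import List, Tuple

Vector = List[float]


def build_external_memory(
    context_reprs: List[Vector], target_tokens: List[int]
) -> List[Tuple[Vector, int]]:
    """Pair each PLM-generated context representation with its target token.

    Hypothetical helper: the abstract only states that representations are
    stored "mapped with target tokens", not how the store is laid out.
    """
    if len(context_reprs) != len(target_tokens):
        raise ValueError("need exactly one target token per representation")
    return list(zip(context_reprs, target_tokens))


def interpolate(p_plm: Vector, p_adapter: Vector, lam: float) -> Vector:
    """Mix two next-token distributions: lam * adapter + (1 - lam) * PLM."""
    if len(p_plm) != len(p_adapter):
        raise ValueError("distributions must cover the same vocabulary")
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_adapter, p_plm)]


def decaying_weight(step: int, max_steps: int, lam_start: float) -> float:
    """Assumed schedule: decay linearly from lam_start to 0 over max_steps."""
    remaining = max(0.0, 1.0 - step / max_steps)
    return lam_start * remaining


# Toy usage with a two-token vocabulary and made-up numbers.
memory = build_external_memory([[0.1, 0.2], [0.3, 0.4]], [5, 7])
lam = decaying_weight(step=2, max_steps=4, lam_start=0.8)
mixed = interpolate(p_plm=[0.5, 0.5], p_adapter=[1.0, 0.0], lam=lam)
```

Because the mixture is a convex combination, `mixed` remains a valid probability distribution whenever both inputs are and `lam` lies in [0, 1].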
Anthology ID:
2024.naacl-long.336
Volume:
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Kevin Duh, Helena Gomez, Steven Bethard
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
6045–6064
URL:
https://aclanthology.org/2024.naacl-long.336
Cite (ACL):
HyunJin Kim, Young Jin Kim, and JinYeong Bak. 2024. PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 6045–6064, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
PEMA: An Offsite-Tunable Plug-in External Memory Adaptation for Language Models (Kim et al., NAACL 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/2024.naacl-long.336.pdf
Copyright:
 2024.naacl-long.336.copyright.pdf