@inproceedings{zhang-etal-2024-selective,
title = "Selective Prefix Tuning for Pre-trained Language Models",
author = "Zhang, Hongyi and
Li, Zuchao and
Wang, Ping and
Zhao, Hai",
editor = "Ku, Lun-Wei and
Martins, Andre and
Srikumar, Vivek",
booktitle = "Findings of the Association for Computational Linguistics: ACL 2024",
month = aug,
year = "2024",
address = "Bangkok, Thailand",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.findings-acl.164/",
doi = "10.18653/v1/2024.findings-acl.164",
pages = "2806--2813",
abstract = "The prevalent approach for optimizing pre-trained language models in downstream tasks is fine-tuning. However, it is both time-consuming and memory-inefficient. In response, a more efficient method called Prefix Tuning, which insert learnable vectors into each Transformer layers, has been proposed and proven effective. Recent investigations reveal that prefix tokens carry context-specific information, prompting the hypothesis that enhancing their specialization can improve model performance. To address this, we propose Selective Prefix Tuning (SPT), integrating a selective mechanism inspired by selective self-attention. Additionally, we introduce Selective Loss (SL) to encourage diversity in prefix tokens. Extensive experiments validate the effectiveness of SPT in sentence and token classification tasks. We contribute insight into understanding the role of prefix in model adaptation."
}
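
For readers unfamiliar with the baseline the abstract builds on, below is a minimal sketch of vanilla prefix tuning: learnable prefix key/value vectors are prepended inside each attention layer while the backbone weights stay frozen. It illustrates only the generic prefix-tuning idea and does not reproduce SPT's selective mechanism or Selective Loss; the `PrefixAttention` module, its single-head layout, and the initialization scale are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of vanilla prefix tuning (the baseline SPT extends).
# Learnable prefix keys/values are prepended in each attention layer;
# only these prefix parameters are trained, the backbone stays frozen.
# Names and dimensions here are illustrative, not from the paper.
import torch
import torch.nn as nn


class PrefixAttention(nn.Module):
    """Single-head self-attention with a learnable prefix of key/value vectors."""

    def __init__(self, d_model: int, prefix_len: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        # Learnable prefix keys/values: the only parameters updated during tuning.
        self.prefix_k = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        b = x.size(0)
        q = self.q(x)
        # Prepend the prefix to the projected keys and values.
        k = torch.cat([self.prefix_k.expand(b, -1, -1), self.k(x)], dim=1)
        v = torch.cat([self.prefix_v.expand(b, -1, -1), self.v(x)], dim=1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / x.size(-1) ** 0.5, dim=-1)
        return attn @ v


if __name__ == "__main__":
    layer = PrefixAttention(d_model=64, prefix_len=8)
    # Freeze everything except the prefix, as in prefix tuning.
    for name, p in layer.named_parameters():
        p.requires_grad = name.startswith("prefix_")
    out = layer(torch.randn(2, 10, 64))
    print(out.shape)  # torch.Size([2, 10, 64])
```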