HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation

Anchun Gui, Han Xiao


Abstract
To fully leverage the advantages of large-scale pre-trained language models (PLMs) on downstream tasks, fine-tuning all of their parameters has become the ubiquitous adaptation paradigm. However, because of the sheer number of parameters in PLMs, this paradigm leads to inefficient updating and excessive resource consumption when fine-tuning in data-scarce and resource-limited scenarios. To alleviate these concerns, in this paper we propose HiFi, a parameter-efficient fine-tuning method that fine-tunes only the attention heads that are highly informative and strongly correlated for the specific task. To identify those significant attention heads, we develop a novel framework for analyzing the effectiveness of heads. Specifically, we first model the relationships between heads as a graph from the two perspectives of information richness and correlation, and then apply the PageRank algorithm to determine the relative importance of each head. Extensive experiments on the GLUE benchmark demonstrate the effectiveness of our method and show that HiFi achieves state-of-the-art performance over prior baselines.
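As a rough illustration of the head-ranking step sketched in the abstract, the snippet below builds a toy head-relationship graph and ranks heads with a plain power-iteration PageRank. It is not the authors' implementation: the edge weights, the helper name pagerank, and the toy matrix are assumptions standing in for the information-richness and correlation scores defined in the paper.

```python
# Hypothetical sketch (not the paper's code): rank attention heads by PageRank
# over an assumed head-relationship graph; the highest-ranked heads would then
# be the ones selected for fine-tuning.
import numpy as np

def pagerank(adj: np.ndarray, damping: float = 0.85, iters: int = 100) -> np.ndarray:
    """Power-iteration PageRank over a weighted adjacency matrix (heads x heads)."""
    n = adj.shape[0]
    # Column-normalize so each head distributes its outgoing relation weight.
    col_sums = adj.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0            # guard against isolated heads
    transition = adj / col_sums
    scores = np.full(n, 1.0 / n)             # uniform initialization
    for _ in range(iters):
        scores = (1 - damping) / n + damping * transition @ scores
    return scores

# Toy example: 4 attention heads with assumed pairwise relation weights
# (in the paper these would combine information richness and correlation).
relation = np.array([
    [0.0, 0.6, 0.1, 0.3],
    [0.6, 0.0, 0.2, 0.4],
    [0.1, 0.2, 0.0, 0.1],
    [0.3, 0.4, 0.1, 0.0],
])
importance = pagerank(relation)
top_heads = np.argsort(importance)[::-1]     # candidates to fine-tune
print(importance, top_heads)
```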
Anthology ID:
2023.acl-long.475
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
8521–8537
URL:
https://aclanthology.org/2023.acl-long.475
DOI:
10.18653/v1/2023.acl-long.475
Bibkey:
Cite (ACL):
Anchun Gui and Han Xiao. 2023. HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8521–8537, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
HiFi: High-Information Attention Heads Hold for Parameter-Efficient Model Adaptation (Gui & Xiao, ACL 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2023.acl-long.475.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-2/2023.acl-long.475.mp4