Attention Mechanism with Energy-Friendly Operations

Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek Wong, Haibo Zhang, Boxing Chen, Lidia Chao


Abstract
The attention mechanism has become the dominant module in natural language processing models. It is computationally intensive and relies on massive, power-hungry multiplications. In this paper, we rethink variants of the attention mechanism from the perspective of energy consumption. After concluding that the energy cost of several energy-friendly operations is far lower than that of their multiplication counterparts, we build a novel attention model by replacing multiplications with either selective operations or additions. Empirical results on three machine translation tasks demonstrate that the proposed model, against the vanilla one, achieves competitive accuracy while saving 99% and 66% of the energy during alignment calculation and the whole attention procedure, respectively. Our code will be released upon acceptance.
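The abstract only sketches the idea of a multiplication-free alignment step. As a rough illustration (not the paper's actual formulation), one addition-based alternative is to score query-key alignment with a negative L1 distance instead of a dot product, so the scoring stage uses only additions and subtractions; the function names and shapes below are assumptions for the sketch.

```python
# Illustrative sketch only: addition-based alignment in place of the dot product.
import torch

def additive_alignment(q, k):
    """q: (batch, tgt_len, d), k: (batch, src_len, d) -> (batch, tgt_len, src_len)."""
    # Sum of |q_i - k_j| over the feature dimension, negated so that
    # closer query/key pairs receive larger alignment scores.
    diff = q.unsqueeze(2) - k.unsqueeze(1)   # (batch, tgt_len, src_len, d)
    return -diff.abs().sum(dim=-1)           # multiplication-free scores

def attention(q, k, v):
    scores = additive_alignment(q, k)
    weights = torch.softmax(scores, dim=-1)  # normalization kept as usual
    return weights @ v                        # value aggregation left unchanged here

# Toy usage
q = torch.randn(2, 5, 8)
k = torch.randn(2, 7, 8)
v = torch.randn(2, 7, 8)
print(attention(q, k, v).shape)  # torch.Size([2, 5, 8])
```

The released code (nlp2ct/e-att, linked below) contains the authors' actual operations; the L1-distance score above is only one well-known way to avoid multiplications in the alignment step.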
Anthology ID:
2022.findings-acl.313
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3969–3976
URL:
https://aclanthology.org/2022.findings-acl.313
DOI:
10.18653/v1/2022.findings-acl.313
Cite (ACL):
Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek Wong, Haibo Zhang, Boxing Chen, and Lidia Chao. 2022. Attention Mechanism with Energy-Friendly Operations. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3969–3976, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Attention Mechanism with Energy-Friendly Operations (Wan et al., Findings 2022)
PDF:
https://preview.aclanthology.org/naacl24-info/2022.findings-acl.313.pdf
Code
nlp2ct/e-att