Attention Mechanism with Energy-Friendly Operations
Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek Wong, Haibo Zhang, Boxing Chen, Lidia Chao
Abstract
The attention mechanism has become the dominant module in natural language processing models. It is computationally intensive and depends on massive power-hungry multiplications. In this paper, we rethink variants of the attention mechanism from the perspective of energy consumption. After concluding that the energy costs of several energy-friendly operations are far lower than those of their multiplication counterparts, we build a novel attention model by replacing multiplications with either selective operations or additions. Empirical results on three machine translation tasks demonstrate that, compared against the vanilla model, the proposed model achieves comparable accuracy while saving 99% and 66% of the energy during alignment calculation and the whole attention procedure, respectively. Our code will be released upon acceptance.

- Anthology ID: 2022.findings-acl.313
- Volume: Findings of the Association for Computational Linguistics: ACL 2022
- Month: May
- Year: 2022
- Address: Dublin, Ireland
- Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 3969–3976
- URL: https://aclanthology.org/2022.findings-acl.313
- DOI: 10.18653/v1/2022.findings-acl.313
- Cite (ACL): Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek Wong, Haibo Zhang, Boxing Chen, and Lidia Chao. 2022. Attention Mechanism with Energy-Friendly Operations. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3969–3976, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal): Attention Mechanism with Energy-Friendly Operations (Wan et al., Findings 2022)
- PDF: https://preview.aclanthology.org/naacl24-info/2022.findings-acl.313.pdf
- Code: nlp2ct/e-att
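The abstract describes replacing the multiplications in attention with selective operations or additions. The paper's exact formulation is not reproduced here; the sketch below is a minimal PyTorch illustration of one addition-based alternative (alignment scores from negative L1 distance between queries and keys), an assumed stand-in rather than the authors' method, and it only makes the alignment step multiplication-free while keeping softmax and value aggregation as in vanilla attention.

```python
import torch

def additive_alignment_scores(q, k):
    """Alignment via negative L1 distance: only subtractions, absolute
    values, and sums -- no query-key multiplications.
    Shapes: q (B, Tq, d), k (B, Tk, d) -> scores (B, Tq, Tk)."""
    diff = q.unsqueeze(2) - k.unsqueeze(1)   # (B, Tq, Tk, d)
    return -diff.abs().sum(dim=-1)           # larger = more similar

def energy_friendly_attention(q, k, v):
    """Attention whose score computation avoids multiplications.
    Softmax and value aggregation are left as in vanilla attention."""
    scores = additive_alignment_scores(q, k)  # (B, Tq, Tk)
    weights = torch.softmax(scores, dim=-1)
    return weights @ v                        # (B, Tq, d_v)

if __name__ == "__main__":
    B, Tq, Tk, d = 2, 5, 7, 16
    q = torch.randn(B, Tq, d)
    k = torch.randn(B, Tk, d)
    v = torch.randn(B, Tk, d)
    print(energy_friendly_attention(q, k, v).shape)  # torch.Size([2, 5, 16])
```

For the construction actually used and evaluated in the paper (including how the remaining multiplications in the attention procedure are handled), see the released code at nlp2ct/e-att.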