Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences
Yifan Chen, Qi Zeng, Dilek Hakkani-Tur, Di Jin, Heng Ji, Yun Yang
Abstract
Transformer-based models are inefficient at processing long sequences due to the quadratic space and time complexity of their self-attention modules. To address this limitation, Linformer and Informer reduce the quadratic complexity to linear (modulo logarithmic factors) via low-dimensional projection and row selection, respectively. These two models are intrinsically connected, and to understand this connection we introduce a theoretical framework of matrix sketching. Based on the theoretical analysis, we propose Skeinformer, which accelerates self-attention and improves the accuracy of the matrix approximation to self-attention via column sampling, adaptive row normalization, and pilot sampling reutilization. Experiments on the Long Range Arena benchmark demonstrate that our methods outperform alternatives with a consistently smaller time/space footprint. (An illustrative sketch of the column-sampling idea is given after the metadata below.)
- Anthology ID:
- 2022.naacl-main.381
- Volume:
- Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Editors:
- Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5187–5199
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.naacl-main.381/
- DOI:
- 10.18653/v1/2022.naacl-main.381
- Cite (ACL):
- Yifan Chen, Qi Zeng, Dilek Hakkani-Tur, Di Jin, Heng Ji, and Yun Yang. 2022. Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5187–5199, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences (Chen et al., NAACL 2022)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2022.naacl-main.381.pdf
- Code:
- pkuzengqi/skeinformer
- Data:
- LRA, ListOps
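To make the column-sampling idea from the abstract concrete, here is a minimal NumPy sketch of sampling-based attention approximation. It is an illustration under simplifying assumptions (uniform column sampling and plain row renormalization), not the authors' Skeinformer algorithm, which additionally uses adaptive row normalization and pilot sampling reutilization; see pkuzengqi/skeinformer for the actual implementation.

```python
import numpy as np

def sampled_softmax_attention(Q, K, V, c, rng=None):
    """Approximate softmax(Q K^T / sqrt(d)) V by sampling c key/value rows,
    i.e., c columns of the n x n attention matrix.

    NOTE: a generic column-sampling sketch for illustration only; uniform
    sampling and plain row renormalization are simplifying assumptions and
    do not reproduce Skeinformer.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = Q.shape
    # Sample c of the n key/value positions uniformly without replacement.
    idx = rng.choice(n, size=c, replace=False)
    K_s, V_s = K[idx], V[idx]
    # Scaled scores against the sampled keys only: an n x c matrix.
    S = Q @ K_s.T / np.sqrt(d)
    # Numerically stabilized exponentials, then row renormalization over
    # the sampled columns, standing in for the full softmax denominator.
    S = np.exp(S - S.max(axis=1, keepdims=True))
    A = S / S.sum(axis=1, keepdims=True)
    return A @ V_s  # n x d approximation of softmax attention output

# Usage: n = 4096 tokens, head dimension d = 64, c = 256 sampled columns.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4096, 64)) for _ in range(3))
out = sampled_softmax_attention(Q, K, V, c=256, rng=rng)
print(out.shape)  # (4096, 64)
```

With c much smaller than n, this costs O(ncd) time and O(nc) memory per head instead of the O(n^2 d) time and O(n^2) memory of exact self-attention, which is the source of the linear-complexity speedups discussed in the abstract.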