Abstract
Dynamic early exiting has recently attracted much attention because it can accelerate the inference of pre-trained models (PTMs). However, previous work on early exiting has neglected the architectural design of the intermediate exits. In this work, we propose a novel framework, Learned Exits and COmparison-based early exiting (LECO), to improve the early exiting performance of PTMs. First, to fully uncover the potential of multi-exit BERT, we design a novel search space for intermediate exits and employ differentiable neural architecture search (DNAS) to automatically design suitable exit architectures for the different intermediate layers. Second, we propose a simple yet effective comparison-based early exiting mechanism (COBEE), which helps PTMs achieve better tradeoffs between performance and speedup. Extensive experiments show that LECO achieves state-of-the-art (SOTA) performance for multi-exit BERT training and dynamic early exiting.
- Anthology ID:
- 2023.acl-srw.43
- Volume:
- Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Vishakh Padmakumar, Gisela Vallejo, Yao Fu
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 298–309
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2023.acl-srw.43/
- DOI:
- 10.18653/v1/2023.acl-srw.43
- Award:
- SRW Best Paper Award
- Cite (ACL):
- Jingfan Zhang, Ming Tan, Pengyu Dai, and Wei Zhu. 2023. LECO: Improving Early Exiting via Learned Exits and Comparison-based Exiting Mechanism. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 298–309, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- LECO: Improving Early Exiting via Learned Exits and Comparison-based Exiting Mechanism (Zhang et al., ACL 2023)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2023.acl-srw.43.pdf