Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation
Sunzhu Li, Peng Zhang, Guobing Gan, Xiuqing Lv, Benyou Wang, Junqiu Wei, Xin Jiang
Abstract
Transformer has been demonstrated effective in Neural Machine Translation (NMT). However, it is memory- and time-consuming on edge devices, which makes real-time feedback difficult. To compress and accelerate Transformer, we propose a Hybrid Tensor-Train (HTT) decomposition, which retains full rank while reducing operations and parameters. A Transformer using HTT, named Hypoformer, consistently and notably outperforms recent lightweight state-of-the-art methods on three standard translation tasks under different parameter and speed scales. In extreme low-resource scenarios, Hypoformer achieves an absolute improvement of 7.1 BLEU points and a 1.27× speedup over the vanilla Transformer on the IWSLT’14 De-En task.
- Anthology ID:
- 2022.emnlp-main.475
- Volume:
- Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 7056–7068
- URL:
- https://aclanthology.org/2022.emnlp-main.475
- Cite (ACL):
- Sunzhu Li, Peng Zhang, Guobing Gan, Xiuqing Lv, Benyou Wang, Junqiu Wei, and Xin Jiang. 2022. Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 7056–7068, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation (Li et al., EMNLP 2022)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2022.emnlp-main.475.pdf
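The abstract's core idea, factoring a large weight matrix into small tensor-train cores to cut parameters and operations, can be illustrated with a minimal sketch. This is a generic rank-r, two-core tensor-train linear layer in NumPy, not the paper's exact HTT formulation; the shapes, names, and the `tt_linear` helper are illustrative assumptions.

```python
import numpy as np

def tt_linear(x, core1, core2):
    """Apply a tensor-train-factored weight to a batch of inputs.

    x:     (batch, i1 * i2)  input, with the feature axis split as i1 x i2
    core1: (i1, o1, r)       first TT core (r is the TT rank)
    core2: (r, i2, o2)       second TT core

    The equivalent dense weight has shape (i1*i2, o1*o2), i.e.
    i1*i2*o1*o2 parameters; the cores store only i1*o1*r + r*i2*o2.
    """
    batch = x.shape[0]
    i1, o1, r = core1.shape
    _, i2, o2 = core2.shape
    xt = x.reshape(batch, i1, i2)
    # Contract the i2 input axis against core2: result (batch, i1, r, o2)
    t = np.einsum('bij,rjo->biro', xt, core2)
    # Contract i1 and the TT rank against core1: result (batch, o1, o2)
    y = np.einsum('biro,ipr->bpo', t, core1)
    return y.reshape(batch, o1 * o2)

# Sanity check: the factored layer matches the dense weight it encodes.
rng = np.random.default_rng(0)
i1, i2, o1, o2, r = 2, 3, 4, 5, 2
core1 = rng.standard_normal((i1, o1, r))
core2 = rng.standard_normal((r, i2, o2))
x = rng.standard_normal((7, i1 * i2))
W = np.einsum('ipr,rjo->ijpo', core1, core2).reshape(i1 * i2, o1 * o2)
assert np.allclose(tt_linear(x, core1, core2), x @ W)
```

With these toy shapes the dense weight holds 6 × 20 = 120 parameters while the two cores hold 16 + 30 = 46, illustrating the compression; the paper's hybrid scheme additionally addresses the rank loss that plain low-rank factorization incurs.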