Efficient Sequence Learning with Group Recurrent Networks
Fei Gao, Lijun Wu, Li Zhao, Tao Qin, Xueqi Cheng, Tie-Yan Liu
Abstract
Recurrent neural networks have achieved state-of-the-art results in many artificial intelligence tasks, such as language modeling, neural machine translation, and speech recognition. One of the key factors behind these successes is big models. However, training such big models usually takes days or even weeks, even with tens of GPU cards. In this paper, we propose an efficient architecture for RNN training, which adopts a group strategy for recurrent layers while exploiting a representation rearrangement strategy between layers as well as across time steps. To demonstrate the advantages of our models, we conduct experiments on several datasets and tasks. The results show that our architecture achieves accuracy comparable to or better than the baselines, with far fewer parameters and at a much lower computational cost.
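The grouping and rearrangement described in the abstract can be made concrete with a short sketch. Below is a minimal PyTorch illustration of one group recurrent layer, assuming LSTM sub-cells and an interleaving shuffle applied between layers; the class and parameter names are our own, the rearrangement across time steps is omitted for brevity, and this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class GroupRecurrentLayer(nn.Module):
    """One recurrent layer split into `groups` independent sub-RNNs,
    followed by a representation rearrangement (an interleaving shuffle
    across groups) so information can mix between groups at the next layer.
    Illustrative sketch only, not the paper's code."""

    def __init__(self, input_size, hidden_size, groups):
        super().__init__()
        assert input_size % groups == 0 and hidden_size % groups == 0
        self.groups = groups
        # Each sub-RNN sees only 1/groups of the features, so the layer
        # holds roughly 1/groups of the parameters of a full LSTM layer.
        self.cells = nn.ModuleList(
            nn.LSTM(input_size // groups, hidden_size // groups,
                    batch_first=True)
            for _ in range(groups)
        )

    def forward(self, x):
        # x: (batch, time, input_size); run each feature group separately.
        outs = [cell(chunk)[0]
                for cell, chunk in zip(self.cells,
                                       x.chunk(self.groups, dim=-1))]
        h = torch.cat(outs, dim=-1)  # (batch, time, hidden_size)
        # Representation rearrangement: view as (groups, k), transpose,
        # and flatten, so each group at the next layer receives features
        # drawn from every group at this layer.
        b, t, d = h.shape
        h = h.view(b, t, self.groups, d // self.groups)
        return h.transpose(2, 3).reshape(b, t, d)

# Example: stacking two such layers on a toy batch.
layer1 = GroupRecurrentLayer(input_size=512, hidden_size=512, groups=4)
layer2 = GroupRecurrentLayer(input_size=512, hidden_size=512, groups=4)
y = layer2(layer1(torch.randn(8, 20, 512)))  # -> (8, 20, 512)
```

With G groups, both the parameter count and the per-step multiplications of the recurrent transition drop roughly by a factor of G, which is the source of the efficiency gains the abstract claims; the shuffle is what keeps the G sub-networks from degenerating into independent, non-communicating models.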
- Anthology ID: N18-1073
- Volume: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
- Month: June
- Year: 2018
- Address: New Orleans, Louisiana
- Editors: Marilyn Walker, Heng Ji, Amanda Stent
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 799–808
- URL: https://aclanthology.org/N18-1073
- DOI: 10.18653/v1/N18-1073
- Cite (ACL): Fei Gao, Lijun Wu, Li Zhao, Tao Qin, Xueqi Cheng, and Tie-Yan Liu. 2018. Efficient Sequence Learning with Group Recurrent Networks. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 799–808, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal): Efficient Sequence Learning with Group Recurrent Networks (Gao et al., NAACL 2018)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/N18-1073.pdf