Abstract
Autoregressive decoding is the only part of sequence-to-sequence models that prevents them from massive parallelization at inference time. Non-autoregressive models enable the decoder to generate all output symbols independently in parallel. We present a novel non-autoregressive architecture based on connectionist temporal classification and evaluate it on the task of neural machine translation. Unlike other non-autoregressive methods which operate in several steps, our model can be trained end-to-end. We conduct experiments on the WMT English-Romanian and English-German datasets. Our models achieve a significant speedup over the autoregressive models, keeping the translation quality comparable to other non-autoregressive models.- Anthology ID:
- D18-1336
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 3016–3021
- Language:
- URL:
- https://aclanthology.org/D18-1336
- DOI:
- 10.18653/v1/D18-1336
- Cite (ACL):
- Jindřich Libovický and Jindřich Helcl. 2018. End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3016–3021, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification (Libovický & Helcl, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/D18-1336.pdf
- Data
- WMT 2014, WMT 2016