Start-Before-End and End-to-End: Neural Speech Translation by AppTek and RWTH Aachen University
Parnia Bahar, Patrick Wilken, Tamer Alkhouli, Andreas Guta, Pavel Golik, Evgeny Matusov, Christian Herold
Abstract
AppTek and RWTH Aachen University team together to participate in the offline and simultaneous speech translation tracks of IWSLT 2020. For the offline task, we create both cascaded and end-to-end speech translation systems, paying attention to careful data selection and weighting. In the cascaded approach, we combine high-quality hybrid automatic speech recognition (ASR) with the Transformer-based neural machine translation (NMT). Our end-to-end direct speech translation systems benefit from pretraining of adapted encoder and decoder components, as well as synthetic data and fine-tuning and thus are able to compete with cascaded systems in terms of MT quality. For simultaneous translation, we utilize a novel architecture that makes dynamic decisions, learned from parallel data, to determine when to continue feeding on input or generate output words. Experiments with speech and text input show that even at low latency this architecture leads to superior translation results.
- Anthology ID:
- 2020.iwslt-1.3
- Volume:
- Proceedings of the 17th International Conference on Spoken Language Translation
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Marcello Federico, Alex Waibel, Kevin Knight, Satoshi Nakamura, Hermann Ney, Jan Niehues, Sebastian Stüker, Dekai Wu, Joseph Mariani, Francois Yvon
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 44–54
- URL:
- https://aclanthology.org/2020.iwslt-1.3
- DOI:
- 10.18653/v1/2020.iwslt-1.3
- Cite (ACL):
- Parnia Bahar, Patrick Wilken, Tamer Alkhouli, Andreas Guta, Pavel Golik, Evgeny Matusov, and Christian Herold. 2020. Start-Before-End and End-to-End: Neural Speech Translation by AppTek and RWTH Aachen University. In Proceedings of the 17th International Conference on Spoken Language Translation, pages 44–54, Online. Association for Computational Linguistics.
- Cite (Informal):
- Start-Before-End and End-to-End: Neural Speech Translation by AppTek and RWTH Aachen University (Bahar et al., IWSLT 2020)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2020.iwslt-1.3.pdf