Neural Speech Translation at AppTek

Evgeny Matusov, Patrick Wilken, Parnia Bahar, Julian Schamper, Pavel Golik, Albert Zeyer, Joan Albert Silvestre-Cerda, Adrià Martínez-Villaronga, Hendrik Pesch, Jan-Thorsten Peter


Abstract
This work describes AppTek’s speech translation pipeline, which includes state-of-the-art automatic speech recognition (ASR) and neural machine translation (NMT) components. We show how these components can be tightly coupled by encoding ASR confusion networks, and by applying ASR-like noise adaptation, vocabulary normalization, and implicit punctuation prediction during translation. In another experimental setup, we propose a direct speech translation approach that can be scaled to translation tasks with large amounts of text-only parallel training data but a limited number of hours of recorded and human-translated speech.
Anthology ID:
2018.iwslt-1.15
Volume:
Proceedings of the 15th International Conference on Spoken Language Translation
Month:
October 29-30
Year:
2018
Address:
Brussels
Venues:
EMNLP | IWSLT
Publisher:
International Conference on Spoken Language Translation
Pages:
104–111
URL:
https://aclanthology.org/2018.iwslt-1.15
Cite (ACL):
Evgeny Matusov, Patrick Wilken, Parnia Bahar, Julian Schamper, Pavel Golik, Albert Zeyer, Joan Albert Silvestre-Cerda, Adrià Martínez-Villaronga, Hendrik Pesch, and Jan-Thorsten Peter. 2018. Neural Speech Translation at AppTek. In Proceedings of the 15th International Conference on Spoken Language Translation, pages 104–111, Brussels. International Conference on Spoken Language Translation.
Cite (Informal):
Neural Speech Translation at AppTek (Matusov et al., IWSLT 2018)
PDF:
https://preview.aclanthology.org/update-css-js/2018.iwslt-1.15.pdf