Spoken language translation using automatically transcribed text in training

Stephan Peitz, Simon Wiesler, Markus Nußbaum-Thom, Hermann Ney


Abstract
In spoken language translation, a machine translation system takes speech as input and translates it into another language. A standard machine translation system is trained on written language data and expects written language as input. In this paper, we propose an approach to close the gap between the output of automatic speech recognition and the input of machine translation by training the translation system on automatically transcribed speech. In our experiments, we show improvements of up to 0.9 BLEU points on the IWSLT 2012 English-to-French speech translation task.
Anthology ID:
2012.iwslt-papers.18
Volume:
Proceedings of the 9th International Workshop on Spoken Language Translation: Papers
Month:
December 6-7
Year:
2012
Address:
Hong Kong
Venue:
IWSLT
Pages:
276–283
URL:
https://aclanthology.org/2012.iwslt-papers.18
Cite (ACL):
Stephan Peitz, Simon Wiesler, Markus Nußbaum-Thom, and Hermann Ney. 2012. Spoken language translation using automatically transcribed text in training. In Proceedings of the 9th International Workshop on Spoken Language Translation: Papers, pages 276–283, Hong Kong.
Cite (Informal):
Spoken language translation using automatically transcribed text in training (Peitz et al., IWSLT 2012)
PDF:
https://preview.aclanthology.org/update-css-js/2012.iwslt-papers.18.pdf