Modeling punctuation prediction as machine translation

Stephan Peitz, Markus Freitag, Arne Mauser, Hermann Ney


Abstract
Punctuation prediction is an important task in Spoken Language Translation. The output of speech recognition systems typically does not contain punctuation marks. In this paper, we analyze different methods for punctuation prediction and show that they improve the quality of the final translation output. On the IWSLT 2011 English-French Speech Translation of Talks task, using a translation system to translate from unpunctuated to punctuated text, instead of a language-model-based punctuation prediction method, yields improvements of up to 0.8 BLEU points. Furthermore, a system combination of the hypotheses of all our approaches gives an additional improvement of 0.4 BLEU points.
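To make the central idea concrete, the following is a minimal sketch of how training data for a translation system that maps unpunctuated to punctuated text might be prepared: each punctuated sentence serves as the target side, and a copy with punctuation stripped serves as the source side. The punctuation inventory and the helper names `strip_punctuation` and `make_parallel_pairs` are assumptions made for illustration; the abstract does not specify them, and this is not the authors' actual pipeline.

```python
import re

# Assumed punctuation inventory; the abstract does not state the exact set.
PUNCT = re.compile(r"[.,!?;:]")


def strip_punctuation(sentence: str) -> str:
    """Remove punctuation marks and collapse the resulting extra whitespace."""
    return " ".join(PUNCT.sub(" ", sentence).split())


def make_parallel_pairs(punctuated_sentences):
    """Pair each unpunctuated source (hypothetical helper) with its punctuated target."""
    return [(strip_punctuation(s), s) for s in punctuated_sentences]


pairs = make_parallel_pairs(["Well, thank you very much.", "Is this working?"])
for source, target in pairs:
    print(f"{source}\t{target}")
```

Pairs of this form could then be passed to any standard translation training toolkit, treating the unpunctuated text as the source language and the punctuated text as the target language.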
Anthology ID:
2011.iwslt-papers.7
Volume:
Proceedings of the 8th International Workshop on Spoken Language Translation: Papers
Month:
December 8-9
Year:
2011
Address:
San Francisco, California
Editors:
Marcello Federico, Mei-Yuh Hwang, Margit Rödder, Sebastian Stüker
Venue:
IWSLT
SIG:
SIGSLT
Pages:
238–245
URL:
https://aclanthology.org/2011.iwslt-papers.7
Cite (ACL):
Stephan Peitz, Markus Freitag, Arne Mauser, and Hermann Ney. 2011. Modeling punctuation prediction as machine translation. In Proceedings of the 8th International Workshop on Spoken Language Translation: Papers, pages 238–245, San Francisco, California.
Cite (Informal):
Modeling punctuation prediction as machine translation (Peitz et al., IWSLT 2011)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2011.iwslt-papers.7.pdf