Abstract
Punctuation prediction is an important task in spoken language translation and can be performed by using a monolingual phrase-based translation system to translate from unpunctuated text to punctuated text. However, a punctuation prediction system based on phrase-based translation cannot capture long-range dependencies between words and punctuation marks. In this paper, we propose to employ hierarchical translation in place of phrase-based translation and show that this approach is more robust for unseen word sequences. Furthermore, we analyze different optimization criteria for tuning the scaling factors of a monolingual statistical machine translation system. In our experiments, we compare the new approach with other punctuation prediction methods and show improvements in terms of F1-Score and BLEU on the IWSLT 2014 German→English and English→French translation tasks.
- Anthology ID:
- 2014.iwslt-papers.17
- Volume:
- Proceedings of the 11th International Workshop on Spoken Language Translation: Papers
- Month:
- December 4-5
- Year:
- 2014
- Address:
- Lake Tahoe, California
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Pages:
- 271–278
- URL:
- https://aclanthology.org/2014.iwslt-papers.17
- Cite (ACL):
- Stephan Peitz, Markus Freitag, and Hermann Ney. 2014. Better punctuation prediction with hierarchical phrase-based translation. In Proceedings of the 11th International Workshop on Spoken Language Translation: Papers, pages 271–278, Lake Tahoe, California.
- Cite (Informal):
- Better punctuation prediction with hierarchical phrase-based translation (Peitz et al., IWSLT 2014)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2014.iwslt-papers.17.pdf