TÜBİTAK Turkish-English submissions for IWSLT 2013
Ertuğrul Yılmaz, İlknur Durgar El-Kahlout, Burak Aydın, Zişan Sıla Özil, Coşkun Mermer
Abstract
This paper describes the TÜBİTAK Turkish-English submissions in both directions for the IWSLT’13 Evaluation Campaign TED Machine Translation (MT) track. We develop both phrase-based and hierarchical phrase-based statistical machine translation (SMT) systems based on Turkish word- and morpheme-level representations. We augment the training data with content words extracted from the data itself and experiment with reversed word order for the source languages. For the Turkish-to-English direction, we use the Gigaword corpus as an additional language model along with the training data. For the English-to-Turkish direction, we implement a wide-coverage Turkish word generator to generate words from stem and morpheme sequences. Finally, we perform system combination of the different systems produced with different word alignments.
- Anthology ID:
- 2013.iwslt-evaluation.19
- Volume:
- Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign
- Month:
- December 5-6
- Year:
- 2013
- Address:
- Heidelberg, Germany
- Editor:
- Joy Ying Zhang
- Venue:
- IWSLT
- SIG:
- SIGSLT
- URL:
- https://aclanthology.org/2013.iwslt-evaluation.19
- Cite (ACL):
- Ertuğrul Yılmaz, İlknur Durgar El-Kahlout, Burak Aydın, Zişan Sıla Özil, and Coşkun Mermer. 2013. TÜBİTAK Turkish-English submissions for IWSLT 2013. In Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign, Heidelberg, Germany.
- Cite (Informal):
- TÜBİTAK Turkish-English submissions for IWSLT 2013 (Yılmaz et al., IWSLT 2013)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2013.iwslt-evaluation.19.pdf