Abstract
This paper reports on FBK’s Machine Translation (MT) submissions to the IWSLT 2012 Evaluation on the TED talk translation tasks. We participated in the English-French and the Arabic-, Dutch-, German-, and Turkish-English translation tasks. Several improvements are reported over our baselines from last year. In addition to using fill-up combinations of phrase tables for domain adaptation, we explore the use of corpus filtering based on cross-entropy to produce concise and accurate translation and language models. We describe challenges encountered in under-resourced languages (Turkish) and language-specific preprocessing needs.
- Anthology ID:
- 2012.iwslt-evaluation.6
- Volume:
- Proceedings of the 9th International Workshop on Spoken Language Translation: Evaluation Campaign
- Month:
- December 6-7
- Year:
- 2012
- Address:
- Hong Kong
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Pages:
- 61–68
- URL:
- https://aclanthology.org/2012.iwslt-evaluation.6
- Cite (ACL):
- N. Ruiz, A. Bisazza, R. Cattoni, and M. Federico. 2012. FBK’s machine translation systems for IWSLT 2012’s TED lectures. In Proceedings of the 9th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 61–68, Hong Kong.
- Cite (Informal):
- FBK’s machine translation systems for IWSLT 2012’s TED lectures (Ruiz et al., IWSLT 2012)
- PDF:
- https://preview.aclanthology.org/ingest-bitext-workshop/2012.iwslt-evaluation.6.pdf