MaTrEx: the DCU machine translation system for IWSLT 2007

Hany Hassan, Yanjun Ma, Andy Way


Abstract
In this paper, we describe the machine translation system developed at DCU and used for our second participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2007). In this participation, we focus on new methods to improve system quality. Specifically, we apply our word packing technique to different language pairs; we smooth our translation tables with out-of-domain word translations for the Arabic–English and Chinese–English tasks to address the high number of out-of-vocabulary items; and we deploy a translation-based model for case and punctuation restoration. We participated in both the classical and challenge tasks for the following translation directions: Chinese–English, Japanese–English and Arabic–English. For the last two tasks, we translated both the single-best ASR hypotheses and the correct recognition results; for Chinese–English, we translated only the correct recognition results. We report the system's results on the provided evaluation sets, together with additional experiments carried out after we identified some simple tokenisation errors in the official runs.
Anthology ID:
2007.iwslt-1.10
Volume:
Proceedings of the Fourth International Workshop on Spoken Language Translation
Month:
October 15-16
Year:
2007
Address:
Trento, Italy
Venue:
IWSLT
URL:
https://aclanthology.org/2007.iwslt-1.10
Cite (ACL):
Hany Hassan, Yanjun Ma, and Andy Way. 2007. MaTrEx: the DCU machine translation system for IWSLT 2007. In Proceedings of the Fourth International Workshop on Spoken Language Translation, Trento, Italy.
Cite (Informal):
MaTrEx: the DCU machine translation system for IWSLT 2007 (Hassan et al., IWSLT 2007)
PDF:
https://preview.aclanthology.org/update-css-js/2007.iwslt-1.10.pdf