Low-resource machine translation using MaTrEx

Yanjun Ma, Tsuyoshi Okita, Özlem Çetinoğlu, Jinhua Du, Andy Way


Abstract
In this paper, we describe the Machine Translation (MT) system developed at DCU for our fourth participation in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT 2009). Two techniques are deployed in our system to improve translation quality in a low-resource scenario. The first is to use multiple segmentations in MT training and to utilise word lattices in the decoding stage. The second is to select the optimal training data for building MT systems. In this year’s participation, we use three different prototype SMT systems, and the outputs of these systems are combined using a standard system combination method. Our system is the top system for the Chinese–English CHALLENGE task in terms of BLEU score.
Anthology ID:
2009.iwslt-evaluation.4
Volume:
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
December 1-2
Year:
2009
Address:
Tokyo, Japan
Venue:
IWSLT
SIG:
SIGSLT
Pages:
29–36
URL:
https://aclanthology.org/2009.iwslt-evaluation.4
Cite (ACL):
Yanjun Ma, Tsuyoshi Okita, Özlem Çetinoğlu, Jinhua Du, and Andy Way. 2009. Low-resource machine translation using MaTrEx. In Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 29–36, Tokyo, Japan.
Cite (Informal):
Low-resource machine translation using MaTrEx (Ma et al., IWSLT 2009)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2009.iwslt-evaluation.4.pdf
Presentation:
2009.iwslt-evaluation.4.Presentation.pdf