The RWTH Aachen machine translation system for IWSLT 2010

Saab Mansour, Stephan Peitz, David Vilar, Joern Wuebker, Hermann Ney


Abstract
In this paper we describe the statistical machine translation system of RWTH Aachen University developed for the translation task of IWSLT 2010. This year, we participated in the BTEC translation task for the Arabic-to-English language direction. We experimented with two state-of-the-art decoders: a phrase-based decoder and a hierarchical decoder. Extensions to the decoders included phrase training (as opposed to heuristic phrase extraction) for the phrase-based decoder, and soft syntactic features for the hierarchical decoder. Additionally, we experimented with various rule-based and statistical segmenters for Arabic. Because the decoders and the segmentation methodologies differ, we expect complementary variation in the results achieved by each system. The next step is to exploit this variation and achieve better results by combining the systems. We try different strategies for system combination and report significant improvements over the best single system.
Anthology ID:
2010.iwslt-evaluation.22
Volume:
Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
December 2-3
Year:
2010
Address:
Paris, France
Venue:
IWSLT
SIG:
SIGSLT
Pages:
163–168
URL:
https://aclanthology.org/2010.iwslt-evaluation.22
Cite (ACL):
Saab Mansour, Stephan Peitz, David Vilar, Joern Wuebker, and Hermann Ney. 2010. The RWTH Aachen machine translation system for IWSLT 2010. In Proceedings of the 7th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 163–168, Paris, France.
Cite (Informal):
The RWTH Aachen machine translation system for IWSLT 2010 (Mansour et al., IWSLT 2010)
PDF:
https://preview.aclanthology.org/landing_page/2010.iwslt-evaluation.22.pdf