Abstract
We present the CMU Syntax Augmented Machine Translation System that was used in the IWSLT-08 evaluation campaign. We participated in the Full-BTEC data track for Chinese-English translation, focusing on transcript translation. For this year’s evaluation, we ported the Syntax Augmented MT toolkit [1] to the Hadoop MapReduce [2] parallel processing architecture, allowing us to efficiently run experiments evaluating a novel “wider pipelines” approach to integrate evidence from N-best alignments into our translation models. We describe each step of the MapReduce pipeline as it is implemented in the open-source SAMT toolkit, and show improvements in translation quality by using N-best alignments in both hierarchical and syntax augmented translation systems.
- Anthology ID:
- 2008.iwslt-evaluation.2
- Volume:
- Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign
- Month:
- October 20-21
- Year:
- 2008
- Address:
- Waikiki, Hawaii
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Pages:
- 18–25
- URL:
- https://aclanthology.org/2008.iwslt-evaluation.2
- Cite (ACL):
- Andreas Zollmann, Ashish Venugopal, and Stephan Vogel. 2008. The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments. In Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 18–25, Waikiki, Hawaii.
- Cite (Informal):
- The CMU syntax-augmented machine translation system: SAMT on Hadoop with n-best alignments. (Zollmann et al., IWSLT 2008)
- PDF:
- https://preview.aclanthology.org/auto-file-uploads/2008.iwslt-evaluation.2.pdf