Voting on N-grams for Machine Translation System Combination

Kenneth Heafield; Alon Lavie

Abstract

System combination exploits differences between machine translation systems to form a combined translation from several system outputs. Core to this process are features that reward n-gram matches between a candidate combination and each system output. Systems differ in performance at the n-gram level despite similar overall scores. We therefore advocate a new feature formulation: for each system and each small n, a feature counts n-gram matches between the system and candidate. We show post-evaluation improvement of 6.67 BLEU over the best system on NIST MT09 Arabic-English test data. Compared to a baseline system combination scheme from WMT 2009, we show improvement in the range of 1 BLEU point.
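The feature formulation described in the abstract (one feature per system and per small n, counting n-gram matches between that system's output and the candidate) can be illustrated with a minimal sketch. The abstract does not specify the details, so several choices here are assumptions made for illustration only: whitespace tokenization, clipped counting (each n-gram occurrence in the system output can be matched at most once), the default `max_n=2`, and the hypothetical names `ngrams` and `match_features`.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset of the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def match_features(candidate, system_outputs, max_n=2):
    """Compute one match-count feature per (system index, n) pair.

    Assumptions (not taken from the paper): whitespace tokenization and
    clipped counts, i.e. a repeated n-gram in the candidate is credited
    only as many times as it occurs in the system output.
    """
    cand_tokens = candidate.split()
    features = {}
    for s, output in enumerate(system_outputs):
        sys_tokens = output.split()
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand_tokens, n)
            sys_counts = ngrams(sys_tokens, n)
            # Counter intersection keeps the minimum count of each n-gram.
            features[(s, n)] = sum((cand_counts & sys_counts).values())
    return features


if __name__ == "__main__":
    feats = match_features("the cat sat", ["the cat sat down", "a cat sat"])
    for key in sorted(feats):
        print(key, feats[key])
```

In a combination scheme of this kind, each `(system, n)` feature would typically receive its own tuned weight, so that a system that is reliable on unigrams but weaker on bigrams can be credited accordingly; how the weights are tuned is outside the scope of this sketch.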

Anthology ID: 2010.amta-papers.34
Volume: Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
Month: October 31-November 4
Year: 2010
Address: Denver, Colorado, USA
Venue: AMTA
Publisher: Association for Machine Translation in the Americas
URL: https://preview.aclanthology.org/nschneid-patch-2/2010.amta-papers.34/
Cite (ACL): Kenneth Heafield and Alon Lavie. 2010. Voting on N-grams for Machine Translation System Combination. In Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers, Denver, Colorado, USA. Association for Machine Translation in the Americas.
Cite (Informal): Voting on N-grams for Machine Translation System Combination (Heafield & Lavie, AMTA 2010)
PDF: https://preview.aclanthology.org/nschneid-patch-2/2010.amta-papers.34.pdf
