Howard Johnson


2012

The technique of pruning phrase tables that are used for statistical machine translation (SMT) can achieve substantial reductions in bulk and improve translation quality, especially for very large corpora such at the Giga-FrEn. This can be further improved by conditioning each significance test on other phrase pair co-occurrence counts resulting in an additional reduction in size and increase in BLEU score. A series of experiments using Moses and the WMT11 corpora for French to English have been performed to quantify the improvement. By adhering strictly to the recommendations for the WMT11 baseline system, a strong reproducible research baseline was employed.

2011

2010

2007

2006

2005

2003