Abstract
This paper provides a fast alternative to Minimum Discrimination Information-based language model adaptation for statistical machine translation. We avoid computing a normalization term that requires full model probabilities (including back-off probabilities) for all n-grams. Rather than re-estimating an entire language model, our Lazy MDI approach leverages a smoothed unigram ratio between an adaptation text and the background language model to scale only the n-gram probabilities corresponding to translation options gathered by the SMT decoder. The effects of the unigram ratio are scaled by adding an additional feature weight to the log-linear discriminative model. We present results on the IWSLT 2012 TED talk translation task and show that Lazy MDI provides comparable language model adaptation performance to classic MDI.
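As an illustration only, the Python sketch below shows the kind of scaling the abstract describes: a smoothed unigram ratio between an adaptation text and the background corpus, applied to an n-gram's background LM log-probability without the normalization term of classic MDI. The function names, the smoothing floor, and the `gamma` exponent are assumptions made for this sketch, not the authors' implementation.

```python
import math
from collections import Counter

def unigram_ratio(adapt_tokens, background_tokens, gamma=0.5, floor=1e-6):
    """Smoothed unigram probability ratio between an in-domain adaptation
    text and the background corpus. `gamma` dampens the ratio (assumed
    here as a stand-in for the MDI scaling exponent)."""
    adapt, background = Counter(adapt_tokens), Counter(background_tokens)
    adapt_total = sum(adapt.values())
    bg_total = sum(background.values())
    ratio = {}
    for w in set(adapt) | set(background):
        p_a = max(adapt[w] / adapt_total, floor)   # floor avoids zero counts
        p_b = max(background[w] / bg_total, floor)
        ratio[w] = (p_a / p_b) ** gamma
    return ratio

def lazy_mdi_score(ngram, base_logprob, ratio, weight=1.0):
    """Rescale one n-gram's background LM log-probability by the log of the
    smoothed unigram ratio of its predicted word, skipping normalization.
    `weight` stands in for the extra feature weight tuned in the
    log-linear model."""
    predicted_word = ngram[-1]
    return base_logprob + weight * math.log(ratio.get(predicted_word, 1.0))
```

In this sketch the ratio table would be built once from the adaptation text, and `lazy_mdi_score` would be applied only to the n-grams queried by the decoder for its translation options, mirroring the "lazy" aspect described above.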
- Anthology ID:
- 2012.iwslt-papers.14
- Volume:
- Proceedings of the 9th International Workshop on Spoken Language Translation: Papers
- Month:
- December 6-7
- Year:
- 2012
- Address:
- Hong Kong
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Pages:
- 244–251
- URL:
- https://aclanthology.org/2012.iwslt-papers.14
- Cite (ACL):
- Nick Ruiz and Marcello Federico. 2012. MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation. In Proceedings of the 9th International Workshop on Spoken Language Translation: Papers, pages 244–251, Hong Kong.
- Cite (Informal):
- MDI adaptation for the lazy: avoiding normalization in LM adaptation for lecture translation (Ruiz & Federico, IWSLT 2012)
- PDF:
- https://preview.aclanthology.org/bionlp-24-ingestion/2012.iwslt-papers.14.pdf