A Discriminative Lexicon Model for Complex Morphology

Minwoo Jeong, Kristina Toutanova, Hisami Suzuki, Chris Quirk


Abstract
This paper describes successful applications of discriminative lexicon models to statistical machine translation (SMT) systems that translate into morphologically complex languages. We extend previous work on discriminatively trained lexicon models to include more contextual information in lexical selection decisions by building a single global log-linear model of translation selection. In offline experiments, we show that the expanded contextual information, including morphological and syntactic features, helps to better predict words in three target languages with complex morphology (Bulgarian, Czech, and Korean). We also show that these improved lexical prediction models have a positive impact in the end-to-end SMT scenario from English to these languages.
Anthology ID:
2010.amta-papers.33
Volume:
Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
Month:
October 31-November 4
Year:
2010
Address:
Denver, Colorado, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
URL:
https://aclanthology.org/2010.amta-papers.33
Cite (ACL):
Minwoo Jeong, Kristina Toutanova, Hisami Suzuki, and Chris Quirk. 2010. A Discriminative Lexicon Model for Complex Morphology. In Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers, Denver, Colorado, USA. Association for Machine Translation in the Americas.
Cite (Informal):
A Discriminative Lexicon Model for Complex Morphology (Jeong et al., AMTA 2010)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2010.amta-papers.33.pdf