Unsupervised Translation Disambiguation for Cross-Domain Statistical Machine Translation

Mei Yang, Katrin Kirchhoff


Abstract
Most attempts at integrating word sense disambiguation with statistical machine translation have focused on supervised disambiguation approaches. These approaches are of limited use when the distribution of the test data differs strongly from that of the training data; however, word sense errors tend to be especially common under these conditions. In this paper we present different approaches to unsupervised word translation disambiguation and apply them to the problem of translating conversational speech under resource-poor training conditions. Both human and automatic evaluation metrics demonstrate significant improvements resulting from our technique.
Anthology ID:
2012.amta-papers.29
Volume:
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers
Month:
October 28-November 1
Year:
2012
Address:
San Diego, California, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
URL:
https://aclanthology.org/2012.amta-papers.29
Cite (ACL):
Mei Yang and Katrin Kirchhoff. 2012. Unsupervised Translation Disambiguation for Cross-Domain Statistical Machine Translation. In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers, San Diego, California, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Unsupervised Translation Disambiguation for Cross-Domain Statistical Machine Translation (Yang & Kirchhoff, AMTA 2012)
PDF:
https://preview.aclanthology.org/update-css-js/2012.amta-papers.29.pdf