A New Method for Automatic Translation Scoring-HyTER

Daniel Marcu


Abstract
It is common knowledge that translation is an ambiguous, 1-to-n mapping process, but to date, our community has produced no empirical estimates of this ambiguity. We have developed an annotation tool that enables us to create representations that compactly encode an exponential number of correct translations for a sentence. Our findings show that naturally occurring sentences have billions of translations. Having access to such large sets of meaning-equivalent translations enables us to develop a new metric, HyTER, for translation accuracy. Using data from the most recent Open MT NIST evaluation, we show that our metric provides better estimates of machine and human translation accuracy than alternative evaluation metrics, and we discuss how HyTER representations can be used to inform a data-driven inquiry into natural language semantics.
Anthology ID:
2012.amta-government.9
Volume:
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Government MT User Program
Month:
October 28-November 1
Year:
2012
Address:
San Diego, California, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
URL:
https://aclanthology.org/2012.amta-government.9
Cite (ACL):
Daniel Marcu. 2012. A New Method for Automatic Translation Scoring-HyTER. In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Government MT User Program, San Diego, California, USA. Association for Machine Translation in the Americas.
Cite (Informal):
A New Method for Automatic Translation Scoring-HyTER (Marcu, AMTA 2012)