Unsupervised Quality Estimation for Neural Machine Translation

Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia


Abstract
Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it is aimed to inform the user on the quality of the MT output at test time. Existing approaches require large amounts of expert annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Different from most of the current work that treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By utilizing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivaling state-of-the-art supervised QE models. To evaluate our approach we collect the first dataset that enables work on both black-box and glass-box approaches to QE.
Anthology ID:
2020.tacl-1.35
Volume:
Transactions of the Association for Computational Linguistics, Volume 8
Month:
Year:
2020
Address:
Cambridge, MA
Editors:
Mark Johnson, Brian Roark, Ani Nenkova
Venue:
TACL
SIG:
Publisher:
MIT Press
Note:
Pages:
539–555
Language:
URL:
https://aclanthology.org/2020.tacl-1.35
DOI:
10.1162/tacl_a_00330
Bibkey:
Cite (ACL):
Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, and Lucia Specia. 2020. Unsupervised Quality Estimation for Neural Machine Translation. Transactions of the Association for Computational Linguistics, 8:539–555.
Cite (Informal):
Unsupervised Quality Estimation for Neural Machine Translation (Fomicheva et al., TACL 2020)
Copy Citation:
PDF:
https://preview.aclanthology.org/naacl24-info/2020.tacl-1.35.pdf