Lightly Supervised Quality Estimation

Matthias Sperber, Graham Neubig, Jan Niehues, Sebastian Stüker, Alex Waibel


Abstract
Evaluating the quality of output from language processing systems such as machine translation or speech recognition is an essential step in ensuring that they are sufficient for practical use. However, depending on the practical requirements, evaluation approaches can differ strongly. Often, reference-based evaluation measures (such as BLEU or WER) are appealing because they are cheap and allow rapid quantitative comparison. On the other hand, practitioners often focus on manual evaluation because they must deal with frequently changing domains and quality standards requested by customers, for which reference-based evaluation is insufficient or impossible due to missing in-domain reference data (Harris et al., 2016). In this paper, we attempt to bridge this gap by proposing a framework for lightly supervised quality estimation. We collect manually annotated scores for a small number of segments in a test corpus or document, and combine them with automatically predicted quality scores for the remaining segments to predict an overall quality estimate. An evaluation shows that our framework estimates quality more reliably than fully automatic quality estimation approaches, while keeping annotation effort low by not requiring full references to be available for the particular domain.
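The abstract does not say how manual and predicted scores are combined, so the following is only an illustrative sketch and not the authors' method. It assumes the simplest possible rule: take the manual score for each annotated segment, fall back to the automatic prediction for the rest, and average over all segments. The function name, the score scale, and the sample values are all hypothetical.

```python
from typing import Dict, List


def combined_quality_estimate(predicted: List[float], manual: Dict[int, float]) -> float:
    """Average segment score, using a manual score wherever one exists.

    Assumption for illustration only: an unweighted mean over segments.
    `predicted` holds one automatic score per segment; `manual` maps the
    indices of the few annotated segments to their human-assigned scores.
    """
    if not predicted:
        raise ValueError("need at least one segment")
    total = 0.0
    for index, auto_score in enumerate(predicted):
        # Prefer the human annotation; otherwise keep the prediction.
        total += manual.get(index, auto_score)
    return total / len(predicted)


# Hypothetical scores for a four-segment document; segment 1 was annotated.
predicted_scores = [0.5, 0.25, 0.75, 1.0]
manual_scores = {1: 0.75}
estimate = combined_quality_estimate(predicted_scores, manual_scores)
```

With no manual annotations the function reduces to the mean of the automatic predictions, and with every segment annotated it reduces to a fully manual evaluation.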
Anthology ID:
C16-1292
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
3103–3113
URL:
https://aclanthology.org/C16-1292
Cite (ACL):
Matthias Sperber, Graham Neubig, Jan Niehues, Sebastian Stüker, and Alex Waibel. 2016. Lightly Supervised Quality Estimation. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 3103–3113, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Lightly Supervised Quality Estimation (Sperber et al., COLING 2016)
PDF:
https://preview.aclanthology.org/update-css-js/C16-1292.pdf
Data
WMT 2015