All You Need is Source! A Study on Source-based Quality Estimation for Neural Machine Translation

Jon Cambra, Mara Nunziatini


Abstract
Segment-level Quality Estimation (QE) is an increasingly sought-after task in the Machine Translation (MT) industry. In recent years, it has evolved considerably, thanks not only to supervised models that use source and hypothesis information, but also to the use of MT probabilities. This work presents a different approach to QE in which only the source segment and the Neural MT (NMT) training data are needed, making it possible to approximate translation quality before inference. Our work is based on the idea that NMT quality at the segment level depends on the degree of similarity between the source segment to be translated and the engine's training data. The proposed features, which measure this aspect of the data, achieve competitive correlations with MT metrics and human judgment, and prove advantageous for the post-editing (PE) prioritization task with domain-adapted engines.
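The core idea of the abstract, that translation quality can be anticipated from how similar a source segment is to the engine's training data, can be illustrated with a minimal sketch. The snippet below is not the paper's actual feature set; it assumes a simple TF-IDF cosine-similarity feature over the source side of the training corpus, chosen purely for illustration, and the function and variable names are hypothetical.

```python
# Illustrative sketch only: scores a source segment by its similarity to the
# source side of the NMT training data. This is NOT the paper's feature set;
# TF-IDF cosine similarity is an assumption made for simplicity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def source_similarity_score(source_segment, training_sources):
    """Return the maximum cosine similarity between a source segment and
    the training-corpus source sentences (hypothetical QE feature)."""
    vectorizer = TfidfVectorizer(ngram_range=(1, 2))
    corpus_matrix = vectorizer.fit_transform(training_sources)
    segment_vector = vectorizer.transform([source_segment])
    return float(cosine_similarity(segment_vector, corpus_matrix).max())


# Example: a higher score suggests the segment is closer to the training data,
# which, under the paper's hypothesis, correlates with higher MT quality.
training_sources = [
    "The patient was administered 5 mg of the drug.",
    "Store the medication at room temperature.",
]
print(source_similarity_score(
    "Administer 10 mg of the drug to the patient.", training_sources))
```

Such a score can be computed before inference, since it requires only the source segment and the training corpus, which is the property the paper exploits for PE prioritization.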
Anthology ID:
2022.amta-upg.15
Volume:
Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track)
Month:
September
Year:
2022
Address:
Orlando, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
210–220
URL:
https://aclanthology.org/2022.amta-upg.15
Cite (ACL):
Jon Cambra and Mara Nunziatini. 2022. All You Need is Source! A Study on Source-based Quality Estimation for Neural Machine Translation. In Proceedings of the 15th Biennial Conference of the Association for Machine Translation in the Americas (Volume 2: Users and Providers Track and Government Track), pages 210–220, Orlando, USA. Association for Machine Translation in the Americas.
Cite (Informal):
All You Need is Source! A Study on Source-based Quality Estimation for Neural Machine Translation (Cambra & Nunziatini, AMTA 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.amta-upg.15.pdf
Presentation:
2022.amta-upg.15.Presentation.pdf