NICT Kyoto Submission for the WMT’20 Quality Estimation Task: Intermediate Training for Domain and Task Adaptation

Raphael Rubino


Abstract
This paper describes the NICT Kyoto submission for the WMT’20 Quality Estimation (QE) shared task. We participated in Task 2: Word and Sentence-level Post-editing Effort, which involved Wikipedia data and two translation directions, namely English-to-German and English-to-Chinese. Our approach is based on multi-task fine-tuned cross-lingual language models (XLM), initially pre-trained and further domain-adapted through intermediate training using the translation language model (TLM) approach, complemented with a novel self-supervised learning task whose aim is to model errors inherent to machine translation outputs. Results obtained on both word-level and sentence-level QE show that the proposed intermediate training method is complementary to language model domain adaptation and outperforms the fine-tuning-only approach.
Anthology ID:
2020.wmt-1.121
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
1042–1048
URL:
https://aclanthology.org/2020.wmt-1.121
Cite (ACL):
Raphael Rubino. 2020. NICT Kyoto Submission for the WMT’20 Quality Estimation Task: Intermediate Training for Domain and Task Adaptation. In Proceedings of the Fifth Conference on Machine Translation, pages 1042–1048, Online. Association for Computational Linguistics.
Cite (Informal):
NICT Kyoto Submission for the WMT’20 Quality Estimation Task: Intermediate Training for Domain and Task Adaptation (Rubino, WMT 2020)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2020.wmt-1.121.pdf
Video:
https://slideslive.com/38939556
Data
WikiMatrix