Lingua Custodia’s Participation at the WMT 2022 Word-Level Auto-completion Shared Task

Melissa Ailem, Jingshu Liu, Jean-gabriel Barthelemy, Raheel Qader


Abstract
This paper presents Lingua Custodia’s submission to the WMT22 shared task on Word-Level Auto-completion (WLAC). We consider two directions, namely German-English and English-German. The WLAC task in Neural Machine Translation (NMT) consists of predicting a target word given a few human-typed characters, the source sentence to translate, and some translation context. Inspired by recent work in terminology control, we propose to treat the human-typed sequence as a constraint, so that the model predicts the right word starting with that sequence. To do so, the source side of the training data is augmented with both the constraints and the translation context. In addition, following recent advances in WLAC, we use a joint optimization strategy that takes several types of translation context into account. Both the automatic and the human accuracy obtained with the submitted systems show the effectiveness of the proposed method.
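The source-side augmentation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the separator tokens (`<left>`, `<right>`, `<typed>`), the function names, and the post-hoc prefix filter are all assumptions made for illustration, since the abstract does not specify the tag inventory or the decoding procedure.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical separator tokens (assumption): the abstract does not give
# the actual tags used to mark the context and the typed characters.
SEP_LEFT = "<left>"
SEP_RIGHT = "<right>"
SEP_TYPED = "<typed>"


def augment_source(source: str, typed: str,
                   left_context: str = "", right_context: str = "") -> str:
    """Append the translation context and the typed characters to the source.

    Empty context strings are skipped, but their tags are kept so that the
    input layout stays the same for every context type.
    """
    parts = [source, SEP_LEFT, left_context,
             SEP_RIGHT, right_context, SEP_TYPED, typed]
    return " ".join(p for p in parts if p)


def pick_completion(scored: Sequence[Tuple[str, float]],
                    typed: str) -> Optional[str]:
    """Return the highest-scoring candidate word that starts with `typed`.

    `scored` is a hypothetical list of (word, score) pairs, e.g. produced by
    a translation model; None is returned when no candidate matches.
    """
    matching = [(word, score) for word, score in scored
                if word.startswith(typed)]
    if not matching:
        return None
    return max(matching, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    example = augment_source("Das ist ein Haus .", "ho",
                             left_context="This is a")
    print(example)
    print(pick_completion([("home", 0.2), ("house", 0.7),
                           ("building", 0.9)], "ho"))
```

In this sketch the typed characters act as a constraint twice: once as part of the augmented input the model would be trained on, and once as a hard prefix filter over candidate words. Whether the submitted systems apply such a filter at decoding time is not stated in the abstract.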
Anthology ID:
2022.wmt-1.118
Volume:
Proceedings of the Seventh Conference on Machine Translation (WMT)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Venue:
WMT
Publisher:
Association for Computational Linguistics
Pages:
1170–1175
URL:
https://aclanthology.org/2022.wmt-1.118
Cite (ACL):
Melissa Ailem, Jingshu Liu, Jean-gabriel Barthelemy, and Raheel Qader. 2022. Lingua Custodia’s Participation at the WMT 2022 Word-Level Auto-completion Shared Task. In Proceedings of the Seventh Conference on Machine Translation (WMT), pages 1170–1175, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Lingua Custodia’s Participation at the WMT 2022 Word-Level Auto-completion Shared Task (Ailem et al., WMT 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.wmt-1.118.pdf