Abstract
Terminology correctness is important in downstream applications of machine translation, and a prevalent way to ensure it is to inject terminology constraints into a translation system. In our submission to the WMT 2023 terminology translation task, we adopt a translate-then-refine approach that can be domain-independent and requires minimal manual effort. We first train a terminology-aware model by annotating random source words with pseudo-terminology translations obtained from word alignment. We then explore two post-processing methods. First, we use an alignment process to detect whether a terminology constraint has been violated, and if so, we re-decode with the violating word negatively constrained. Alternatively, we leverage a large language model to refine a hypothesis when provided with the terminology constraints. Results show that our terminology-aware model learns to incorporate terminology effectively, and that the large language model refinement process can further improve terminology recall.
- Anthology ID:
- 2023.wmt-1.80
- Volume:
- Proceedings of the Eighth Conference on Machine Translation
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 890–896
- URL:
- https://aclanthology.org/2023.wmt-1.80
- DOI:
- 10.18653/v1/2023.wmt-1.80
- Cite (ACL):
- Nikolay Bogoychev and Pinzhen Chen. 2023. Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting. In Proceedings of the Eighth Conference on Machine Translation, pages 890–896, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting (Bogoychev & Chen, WMT 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.wmt-1.80.pdf
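The refinement step described in the abstract — detect violated terminology constraints in a hypothesis, then ask a large language model to rewrite it — can be illustrated with a minimal sketch. This is not the authors' implementation: it checks constraints by simple substring matching rather than word alignment, and the function names, constraint format, and prompt wording are all assumptions for illustration.

```python
def check_term_constraints(hypothesis: str, constraints: dict[str, str]) -> list[str]:
    """Return the required target-side terms missing from the hypothesis.

    `constraints` maps a source term to its required target translation.
    A constraint counts as satisfied on a case-insensitive substring
    match; the paper instead uses word alignment, which this sketch
    does not reproduce.
    """
    hyp = hypothesis.lower()
    return [tgt for tgt in constraints.values() if tgt.lower() not in hyp]


def refinement_prompt(source: str, hypothesis: str, violated: list[str]) -> str:
    """Build a prompt asking an LLM to refine the draft translation so
    that it contains the missing terms (illustrative wording only)."""
    terms = ", ".join(f'"{t}"' for t in violated)
    return (
        f"Source: {source}\n"
        f"Draft translation: {hypothesis}\n"
        f"Rewrite the draft so that it contains the terms: {terms}."
    )


# Hypothetical English-German example: the constraint requires "Katze".
missing = check_term_constraints("Das Tier sitzt.", {"cat": "Katze"})
if missing:
    prompt = refinement_prompt("The cat sits.", "Das Tier sitzt.", missing)
```

A hypothesis that already contains every required term would yield an empty `missing` list and skip the refinement call entirely, which mirrors the paper's design of refining only when a violation is detected.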