Abstract
Terminology correctness is important in downstream applications of machine translation, and a prevalent way to ensure it is to inject terminology constraints into a translation system. In our submission to the WMT 2023 terminology translation task, we adopt a translate-then-refine approach that can be domain-independent and requires minimal manual effort. We first train a terminology-aware model by annotating random source words with pseudo-terminology translations obtained from word alignment. We then explore two post-processing methods. First, we use an alignment process to detect whether a terminology constraint has been violated, and if so, we re-decode with the violating word negatively constrained. Alternatively, we leverage a large language model to refine a hypothesis when provided with the terminology constraints. Results show that our terminology-aware model learns to incorporate terminology effectively, and that the large language model refinement process can further improve terminology recall.
- Anthology ID:
- 2023.wmt-1.80
- Volume:
- Proceedings of the Eighth Conference on Machine Translation
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Philipp Koehn, Barry Haddow, Tom Kocmi, Christof Monz
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 890–896
- URL:
- https://aclanthology.org/2023.wmt-1.80
- DOI:
- 10.18653/v1/2023.wmt-1.80
- Cite (ACL):
- Nikolay Bogoychev and Pinzhen Chen. 2023. Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting. In Proceedings of the Eighth Conference on Machine Translation, pages 890–896, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting (Bogoychev & Chen, WMT 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.wmt-1.80.pdf
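The refinement step described in the abstract — detect violated terminology constraints in a hypothesis, then ask a large language model to rewrite it — can be illustrated with a minimal sketch. This is not the authors' implementation: it checks constraints by simple substring matching rather than word alignment, and the function names, constraint format, and prompt wording are all assumptions for illustration.

```python
def check_term_constraints(hypothesis: str, constraints: dict[str, str]) -> list[str]:
    """Return the required target-side terms missing from the hypothesis.

    `constraints` maps a source term to its required target translation.
    A constraint counts as satisfied on a case-insensitive substring
    match; the paper instead uses word alignment, which this sketch
    does not reproduce.
    """
    hyp = hypothesis.lower()
    return [tgt for tgt in constraints.values() if tgt.lower() not in hyp]


def refinement_prompt(source: str, hypothesis: str, violated: list[str]) -> str:
    """Build a prompt asking an LLM to refine the draft translation so
    that it contains the missing terms (illustrative wording only)."""
    terms = ", ".join(f'"{t}"' for t in violated)
    return (
        f"Source: {source}\n"
        f"Draft translation: {hypothesis}\n"
        f"Rewrite the draft so that it contains the terms: {terms}."
    )


# Hypothetical English-German example: the constraint requires "Katze".
missing = check_term_constraints("Das Tier sitzt.", {"cat": "Katze"})
if missing:
    prompt = refinement_prompt("The cat sits.", "Das Tier sitzt.", missing)
```

A hypothesis that already contains every required term would yield an empty `missing` list and skip the refinement call entirely, which mirrors the paper's design of refining only when a violation is detected.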