@inproceedings{raunak-etal-2024-instruction,
title = "On Instruction-Finetuning Neural Machine Translation Models",
author = "Raunak, Vikas and
Grundkiewicz, Roman and
Junczys-Dowmunt, Marcin",
editor = "Haddow, Barry and
Kocmi, Tom and
Koehn, Philipp and
Monz, Christof",
booktitle = "Proceedings of the Ninth Conference on Machine Translation",
month = nov,
year = "2024",
address = "Miami, Florida, USA",
publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.wmt-1.114/",
doi = "10.18653/v1/2024.wmt-1.114",
pages = "1155--1166",
    abstract = "In this work, we introduce instruction finetuning for Neural Machine Translation (NMT) models, which distills instruction following capabilities from Large Language Models (LLMs) into orders-of-magnitude smaller NMT models. Our instruction-finetuning recipe for NMT models enables customization of translations for a limited but disparate set of translation-specific tasks. We show that NMT models are capable of following multiple instructions simultaneously and demonstrate capabilities of zero-shot composition of instructions. We also show that through instruction finetuning, traditionally disparate tasks such as formality-controlled machine translation, multi-domain adaptation as well as multi-modal translations can be tackled jointly by a single instruction-finetuned NMT model, at a performance level comparable to LLMs such as GPT-3.5-Turbo. To the best of our knowledge, our work is among the first to demonstrate the instruction-following capabilities of traditional NMT models, which allows for faster, cheaper and more efficient serving of customized translations."
}