@inproceedings{purason-etal-2024-mixing,
title = "Mixing and Matching: Combining Independently Trained Translation Model Components",
author = {Purason, Taido and
T{\"a}ttar, Andre and
Fishel, Mark},
editor = {V{\'a}zquez, Ra{\'u}l and
Mickus, Timothee and
Tiedemann, J{\"o}rg and
Vuli{\'c}, Ivan and
{\"U}st{\"u}n, Ahmet},
booktitle = "Proceedings of the 1st Workshop on Modular and Open Multilingual NLP (MOOMIN 2024)",
month = mar,
year = "2024",
address = "St Julians, Malta",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/add-emnlp-2024-awards/2024.moomin-1.5/",
pages = "44--56",
abstract = "This paper investigates how to combine encoders and decoders of different independently trained NMT models. Combining encoders/decoders is not directly possible since the intermediate representations of any two independent NMT models are different and cannot be combined without modification. To address this, firstly, a dimension adapter is added if the encoder and decoder have different embedding dimensionalities, and secondly, representation adapter layers are added to align the encoder`s representations for the decoder to process. As a proof of concept, this paper looks at many-to-Estonian translation and combines a massively multilingual encoder (NLLB) and a high-quality language-specific decoder. The paper successfully demonstrates that the sentence representations of two independent NMT models can be made compatible without changing the pre-trained components while keeping translation quality from deteriorating. Results show improvements in both translation quality and speed for many-to-one translation over the baseline multilingual model."
}
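
The abstract's bridging idea can be sketched as follows: a linear dimension adapter when encoder and decoder widths differ, followed by trainable representation adapter layers, with both pre-trained components frozen. This is a minimal PyTorch sketch under assumptions of mine, not the paper's actual architecture: the class names, the residual bottleneck design, the bottleneck size, and the layer count are all illustrative.

```python
import torch
import torch.nn as nn


class DimensionAdapter(nn.Module):
    """Linear projection bridging mismatched embedding dimensionalities
    (hypothetical name; the paper adds such an adapter when dims differ)."""

    def __init__(self, enc_dim: int, dec_dim: int):
        super().__init__()
        self.proj = nn.Linear(enc_dim, dec_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


class RepresentationAdapter(nn.Module):
    """One adapter layer aligning encoder states for the decoder.
    The residual bottleneck form and size are assumptions, not the
    paper's specification."""

    def __init__(self, dim: int, bottleneck: int = 256):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(self.norm(x))))


class AdapterBridge(nn.Module):
    """Maps a frozen encoder's outputs into a frozen decoder's
    representation space; only this bridge would be trained."""

    def __init__(self, enc_dim: int, dec_dim: int, n_layers: int = 2):
        super().__init__()
        self.dim_adapter = (
            DimensionAdapter(enc_dim, dec_dim)
            if enc_dim != dec_dim
            else nn.Identity()
        )
        self.layers = nn.ModuleList(
            RepresentationAdapter(dec_dim) for _ in range(n_layers)
        )

    def forward(self, encoder_states: torch.Tensor) -> torch.Tensor:
        h = self.dim_adapter(encoder_states)
        for layer in self.layers:
            h = layer(h)
        return h


# Usage sketch: bridge a 1024-dim encoder to a 512-dim decoder.
# In the paper's setup the pre-trained encoder and decoder stay frozen
# (requires_grad_(False) on their parameters); only the bridge trains.
bridge = AdapterBridge(enc_dim=1024, dec_dim=512)
states = torch.randn(8, 32, 1024)  # (batch, source length, enc_dim)
aligned = bridge(states)           # (8, 32, 512), ready for the decoder
```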