Abstract
Large multilingual Transformer-based machine translation models have played a pivotal role in making translation systems available for hundreds of languages, with good zero-shot translation performance. One such example is the universal model with a shared encoder-decoder architecture. Jointly trained language-specific encoder-decoder systems have also been proposed for multilingual neural machine translation (NMT). This work investigates various knowledge-sharing approaches on the encoder side while keeping the decoder language- or language-group-specific. We propose a novel approach that combines universal, language-group-specific, and language-specific modules to address the shortcomings of both universal models and models with language-specific encoders and decoders. Experiments on a multilingual dataset set up to model real-world scenarios, including zero-shot and low-resource translation, show that our proposed models achieve higher translation quality than purely universal and purely language-specific approaches.
- Anthology ID:
- 2022.eamt-1.12
- Volume:
- Proceedings of the 23rd Annual Conference of the European Association for Machine Translation
- Month:
- June
- Year:
- 2022
- Address:
- Ghent, Belgium
- Venue:
- EAMT
- Publisher:
- European Association for Machine Translation
- Pages:
- 91–100
- URL:
- https://aclanthology.org/2022.eamt-1.12
- Cite (ACL):
- Taido Purason and Andre Tättar. 2022. Multilingual Neural Machine Translation With the Right Amount of Sharing. In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 91–100, Ghent, Belgium. European Association for Machine Translation.
- Cite (Informal):
- Multilingual Neural Machine Translation With the Right Amount of Sharing (Purason & Tättar, EAMT 2022)
- PDF:
- https://aclanthology.org/2022.eamt-1.12.pdf