Improving Target-side Lexical Transfer in Multilingual Neural Machine Translation

Luyu Gao, Xinyi Wang, Graham Neubig


Abstract
To improve the performance of Neural Machine Translation (NMT) for low-resource languages (LRLs), one effective strategy is to leverage parallel data from a related high-resource language (HRL). However, multilingual data has been found to be more beneficial for NMT models that translate from the LRL into a target language than for those that translate into the LRL. In this paper, we aim to improve the effectiveness of multilingual transfer for NMT models that translate into the LRL by designing a better decoder word embedding. Extending Soft Decoupled Encoding (Wang et al., 2019), a general-purpose multilingual encoding method, we propose DecSDE, an efficient character n-gram based embedding specifically designed for the NMT decoder. Our experiments show that DecSDE leads to consistent gains of up to 1.8 BLEU on translation from English to four different languages.
Anthology ID:
2020.findings-emnlp.319
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3560–3566
URL:
https://aclanthology.org/2020.findings-emnlp.319
DOI:
10.18653/v1/2020.findings-emnlp.319
Cite (ACL):
Luyu Gao, Xinyi Wang, and Graham Neubig. 2020. Improving Target-side Lexical Transfer in Multilingual Neural Machine Translation. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3560–3566, Online. Association for Computational Linguistics.
Cite (Informal):
Improving Target-side Lexical Transfer in Multilingual Neural Machine Translation (Gao et al., Findings 2020)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2020.findings-emnlp.319.pdf