Latent Group Dropout for Multilingual and Multidomain Machine Translation

Minh-Quang Pham, François Yvon, Josep Crego


Abstract
Multidomain and multilingual machine translation often rely on parameter-sharing strategies, where large portions of the network are meant to capture the commonalities of the tasks at hand, while smaller parts are reserved to model the peculiarities of a language or a domain. In adapter-based approaches, these strategies are hardcoded into the network architecture, independently of the similarities between tasks. In this work, we propose a new method to better take advantage of these similarities, using a latent-variable model. We also develop new techniques to train this model end-to-end and report experimental results showing that the learned patterns are both meaningful and yield improved translation performance without any increase in model size.
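
The abstract does not spell out the mechanism; as a rough illustration only, below is a minimal PyTorch sketch of one way such a latent-variable scheme could look, assuming "group dropout" amounts to a per-task latent choice over binary layer-group masks, relaxed with straight-through Gumbel-softmax so it can be trained end-to-end. The class name, the pattern menu, and all parameters are hypothetical and not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentGroupDropout(nn.Module):
    """Hypothetical sketch: per-task latent choice over layer-group masks."""

    def __init__(self, n_tasks: int, patterns: torch.Tensor, tau: float = 1.0):
        super().__init__()
        # patterns: (n_patterns, n_groups) binary matrix; each row is one
        # candidate sharing pattern (1 = keep the layer group, 0 = drop it).
        self.register_buffer("patterns", patterns.float())
        # One categorical distribution over patterns per task (language/domain).
        self.logits = nn.Parameter(torch.zeros(n_tasks, patterns.size(0)))
        self.tau = tau

    def forward(self, task_id: int) -> torch.Tensor:
        # Straight-through Gumbel-softmax: hard one-hot in the forward pass,
        # differentiable w.r.t. self.logits, so training stays end-to-end.
        one_hot = F.gumbel_softmax(self.logits[task_id], tau=self.tau, hard=True)
        # Mix the candidate patterns into a single (n_groups,) mask.
        return one_hot @ self.patterns


# Toy usage: 3 tasks, 4 layer groups, a menu of "keep 2 of 4 groups" patterns.
patterns = torch.tensor([[1, 1, 0, 0],
                         [1, 0, 1, 0],
                         [0, 1, 0, 1],
                         [0, 0, 1, 1]])
lgd = LatentGroupDropout(n_tasks=3, patterns=patterns)
mask = lgd(task_id=1)  # (4,) binary mask over layer groups
# Inside a Transformer, group g's residual branch would be scaled by mask[g]:
#   h = h + mask[g] * layer_g(h)
```

Because the masks are selected per task rather than fixed in the architecture, tasks that learn similar distributions over patterns end up sharing the same layer groups, which is the kind of learned sharing the abstract contrasts with hardcoded adapters; the actual formulation in the paper may differ.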
Anthology ID: 2022.findings-naacl.189
Volume: Findings of the Association for Computational Linguistics: NAACL 2022
Month: July
Year: 2022
Address: Seattle, United States
Editors: Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 2469–2481
URL: https://aclanthology.org/2022.findings-naacl.189
DOI: 10.18653/v1/2022.findings-naacl.189
Cite (ACL): Minh-Quang Pham, François Yvon, and Josep Crego. 2022. Latent Group Dropout for Multilingual and Multidomain Machine Translation. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 2469–2481, Seattle, United States. Association for Computational Linguistics.
Cite (Informal): Latent Group Dropout for Multilingual and Multidomain Machine Translation (Pham et al., Findings 2022)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2022.findings-naacl.189.pdf
Video: https://preview.aclanthology.org/nschneid-patch-4/2022.findings-naacl.189.mp4