Adaptive Knowledge Sharing in Multi-Task Learning: Improving Low-Resource Neural Machine Translation
Abstract
Neural Machine Translation (NMT) is notorious for its need for large amounts of bilingual data. An effective approach to compensate for this requirement is Multi-Task Learning (MTL), which leverages different linguistic resources as a source of inductive bias. Current MTL architectures are based on Seq2Seq transduction and (partially) share different components of the models among the tasks. However, this MTL approach often suffers from task interference and is not able to fully capture commonalities among subsets of tasks. We address this issue by extending the recurrent units with multiple “blocks” along with a trainable “routing network”. The routing network enables adaptive collaboration by dynamically sharing blocks conditioned on the task at hand, the input, and the model state. Empirical evaluation on two low-resource translation tasks, English to Vietnamese and English to Farsi, shows +1 BLEU score improvements compared to strong baselines.
- Anthology ID:
- P18-2104
- Volume:
- Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
- Month:
- July
- Year:
- 2018
- Address:
- Melbourne, Australia
- Editors:
- Iryna Gurevych, Yusuke Miyao
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 656–661
- URL:
- https://aclanthology.org/P18-2104
- DOI:
- 10.18653/v1/P18-2104
- Cite (ACL):
- Poorya Zaremoodi, Wray Buntine, and Gholamreza Haffari. 2018. Adaptive Knowledge Sharing in Multi-Task Learning: Improving Low-Resource Neural Machine Translation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 656–661, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal):
- Adaptive Knowledge Sharing in Multi-Task Learning: Improving Low-Resource Neural Machine Translation (Zaremoodi et al., ACL 2018)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/P18-2104.pdf
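
The abstract describes recurrent units extended with multiple blocks and a routing network that shares them depending on the task, the input, and the model state. The following is only a minimal sketch of that idea, not the authors' implementation: the abstract does not give the equations, so the softmax gating, the plain tanh cells used as blocks, and every name here (`RoutedRecurrentUnit`, `step`, `task_emb`) are illustrative assumptions.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class RoutedRecurrentUnit:
    """Hypothetical recurrent unit with several blocks mixed by a router."""

    def __init__(self, n_blocks, input_dim, hidden_dim, task_dim, seed=0):
        rng = random.Random(seed)
        feat_dim = input_dim + hidden_dim
        # Assumption: each block is a plain tanh cell with its own weights.
        self.blocks = [rand_matrix(hidden_dim, feat_dim, rng) for _ in range(n_blocks)]
        # Assumption: the router is one linear layer followed by a softmax.
        self.router = rand_matrix(n_blocks, feat_dim + task_dim, rng)

    def step(self, x, h_prev, task_emb):
        features = x + h_prev  # list concatenation: input and model state
        # Routing weights depend on the input, the state, and the task.
        gates = softmax(matvec(self.router, features + task_emb))
        block_states = [
            [math.tanh(v) for v in matvec(weights, features)]
            for weights in self.blocks
        ]
        # New state: gate-weighted mixture of the block outputs.
        h_new = [
            sum(g * state[j] for g, state in zip(gates, block_states))
            for j in range(len(h_prev))
        ]
        return h_new, gates


unit = RoutedRecurrentUnit(n_blocks=3, input_dim=4, hidden_dim=5, task_dim=2)
x = [0.1, -0.2, 0.3, 0.0]
h0 = [0.0] * 5
# Same input and state, two different (hypothetical) task embeddings.
h_mt, gates_mt = unit.step(x, h0, [1.0, 0.0])
h_aux, gates_aux = unit.step(x, h0, [0.0, 1.0])
```

Because the task embedding feeds the router, the same input and state yield different block mixtures for the two tasks, which is the sense in which sharing is adaptive rather than fixed in advance.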