Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation

Shuhao Gu, Yang Feng, Wanying Xie


Abstract
Domain adaptation is widely used in practical applications of neural machine translation, aiming to achieve good performance on both general-domain and in-domain data. However, existing domain-adaptation methods usually suffer from catastrophic forgetting, large domain divergence, and model explosion. To address these three problems, we propose a “divide and conquer” method based on the importance of neurons or parameters in the translation model. We first prune the model, keeping only the important neurons or parameters and making them responsible for both general-domain and in-domain translation. We then further train the pruned model, supervised by the original whole model through knowledge distillation. Finally, we expand the model back to its original size and fine-tune the added parameters for in-domain translation. Experiments on different language pairs and domains show that our method achieves significant improvements over several strong baselines.
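Since the abstract walks through a concrete three-stage procedure, a hedged sketch may help make it tangible. The following is a minimal toy illustration in PyTorch, assuming magnitude-based importance, a 30% keep ratio, and a single linear layer standing in for the full translation model; these specifics are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of the three stages described in the abstract.
# Assumed details (not from the paper): magnitude-based importance,
# a fixed keep ratio, and a toy linear layer instead of a Transformer.
import torch
import torch.nn.functional as F

keep_ratio = 0.3                             # assumed fraction kept for general domain

model = torch.nn.Linear(512, 512)            # stand-in for the translation model
teacher = torch.nn.Linear(512, 512)          # frozen copy of the original model
teacher.load_state_dict(model.state_dict())
for p in teacher.parameters():
    p.requires_grad_(False)

# Stage 1: prune, keeping only the "important" (here: largest-magnitude) weights.
w = model.weight
threshold = torch.quantile(w.detach().abs().flatten(), 1.0 - keep_ratio)
keep_mask = (w.detach().abs() >= threshold).float()
with torch.no_grad():
    w.mul_(keep_mask)                        # zero out the unimportant slots

opt = torch.optim.SGD(model.parameters(), lr=0.1)  # plain SGD: zeroed grads stay frozen

def train_step(x, gold, grad_mask, distill):
    """One update; grad_mask decides which weight slots may change.
    Bias terms are left unmasked here for brevity."""
    opt.zero_grad()
    logits = model(x)
    loss = F.cross_entropy(logits, gold)
    if distill:                              # Stage 2: KD from the original model
        T = 2.0
        loss = loss + T * T * F.kl_div(
            F.log_softmax(logits / T, dim=-1),
            F.softmax(teacher(x) / T, dim=-1),
            reduction="batchmean")
    loss.backward()
    w.grad.mul_(grad_mask)                   # freeze the complementary slots
    opt.step()

x, gold = torch.randn(8, 512), torch.randint(0, 512, (8,))
# Stage 2: train the pruned model on general-domain data, distilled from the teacher.
train_step(x, gold, grad_mask=keep_mask, distill=True)
# Stage 3: expand to full size; fine-tune only the freed slots on in-domain data.
train_step(x, gold, grad_mask=1.0 - keep_mask, distill=False)
```

In the full model the same split would be applied to every weight matrix: the kept sub-network serves both domains, while the re-grown parameters specialize to the new domain.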
Anthology ID:
2021.naacl-main.308
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
3942–3952
URL:
https://aclanthology.org/2021.naacl-main.308
DOI:
10.18653/v1/2021.naacl-main.308
Bibkey:
gu-etal-2021-pruning
Cite (ACL):
Shuhao Gu, Yang Feng, and Wanying Xie. 2021. Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3942–3952, Online. Association for Computational Linguistics.
Cite (Informal):
Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation (Gu et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.308.pdf
Video:
https://aclanthology.org/2021.naacl-main.308.mp4
Code:
ictnlp/PTE-NMT (https://github.com/ictnlp/PTE-NMT)