Regularization techniques for fine-tuning in neural machine translation

Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, Rico Sennrich


Abstract
We investigate techniques for supervised domain adaptation for neural machine translation where an existing model trained on a large out-of-domain dataset is adapted to a small in-domain dataset. In this scenario, overfitting is a major challenge. We investigate a number of techniques to reduce overfitting and improve transfer learning, including regularization techniques such as dropout and L2-regularization towards an out-of-domain prior. In addition, we introduce tuneout, a novel regularization technique inspired by dropout. We apply these techniques, alone and in combination, to neural machine translation, obtaining improvements on IWSLT datasets for English→German and English→Russian. We also investigate the amounts of in-domain training data needed for domain adaptation in NMT, and find a logarithmic relationship between the amount of training data and gain in BLEU score.
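One of the techniques named in the abstract, L2-regularization towards an out-of-domain prior, can be illustrated with a minimal sketch. It assumes the penalty is the squared distance between the fine-tuned parameters and the out-of-domain parameters, scaled by a strength hyperparameter. The function names, the flat-list parameter representation, and the `strength` argument are illustrative assumptions, not the authors' implementation.

```python
def l2_to_prior_penalty(params, prior_params, strength):
    """Penalty that grows as fine-tuned params drift from the out-of-domain params.

    params, prior_params: flat lists of floats (hypothetical representation).
    strength: regularization weight (hypothetical hyperparameter name).
    """
    if len(params) != len(prior_params):
        raise ValueError("params and prior_params must have the same length")
    return strength * sum((p - q) ** 2 for p, q in zip(params, prior_params))


def l2_to_prior_gradient(params, prior_params, strength):
    """Gradient of the penalty with respect to each fine-tuned parameter."""
    return [2.0 * strength * (p - q) for p, q in zip(params, prior_params)]


def regularized_loss(task_loss, params, prior_params, strength):
    """In-domain training loss plus the pull towards the out-of-domain model."""
    return task_loss + l2_to_prior_penalty(params, prior_params, strength)
```

Unlike ordinary weight decay, which pulls parameters towards zero, this penalty is zero when the fine-tuned parameters equal the out-of-domain ones, so it only discourages movement away from the starting model.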
Anthology ID:
D17-1156
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Martha Palmer, Rebecca Hwa, Sebastian Riedel
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1489–1494
URL:
https://aclanthology.org/D17-1156
DOI:
10.18653/v1/D17-1156
Cite (ACL):
Antonio Valerio Miceli Barone, Barry Haddow, Ulrich Germann, and Rico Sennrich. 2017. Regularization techniques for fine-tuning in neural machine translation. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1489–1494, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Regularization techniques for fine-tuning in neural machine translation (Miceli Barone et al., EMNLP 2017)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/D17-1156.pdf