Fast Gated Neural Domain Adaptation: Language Model as a Case Study

Jian Zhang, Xiaofeng Wu, Andy Way, Qun Liu


Abstract
Neural network training has been shown to be advantageous in many natural language processing applications, such as language modelling or machine translation. In this paper, we describe in detail a novel domain adaptation mechanism in neural network training. Instead of learning and adapting the neural network on millions of training sentences – which can be very time-consuming or even infeasible in some cases – we design a domain adaptation gating mechanism which can be used in recurrent neural networks and quickly learn the out-of-domain knowledge directly from the word vector representations with little speed overhead. In our experiments, we use the recurrent neural network language model (LM) as a case study. We show that the neural LM perplexity can be reduced by 7.395 and 12.011 using the proposed domain adaptation mechanism on the Penn Treebank and News data, respectively. Furthermore, we show that using the domain-adapted neural LM to re-rank the statistical machine translation n-best list on the French-to-English language pair can significantly improve translation quality.
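The abstract does not give the equations of the gating mechanism, so the following is only a minimal sketch of the general idea of a learned gate that blends an in-domain word vector with an out-of-domain one. It is not the paper's formulation. The function name `gated_mix`, the element-wise (diagonal) gate, and all parameter names are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_mix(in_vec, out_vec, w_in, w_out, bias):
    """Blend two word vectors with an element-wise sigmoid gate.

    Hypothetical illustration only: each dimension gets its own gate value
    g in (0, 1), computed from both inputs, and the output is
    g * in_vec + (1 - g) * out_vec for that dimension.
    """
    mixed = []
    for x_in, x_out, a, b, c in zip(in_vec, out_vec, w_in, w_out, bias):
        g = sigmoid(a * x_in + b * x_out + c)  # how much in-domain signal to keep
        mixed.append(g * x_in + (1.0 - g) * x_out)
    return mixed


# With all gate parameters at zero the gate is 0.5, so the result is the average.
in_domain = [1.0, 2.0]
out_of_domain = [3.0, 4.0]
zeros = [0.0, 0.0]
print(gated_mix(in_domain, out_of_domain, zeros, zeros, zeros))
```

In a sketch like this, only the gate parameters would need to be trained during adaptation, which is one plausible reading of why such a mechanism could add little speed overhead. The paper itself should be consulted for the actual architecture.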
Anthology ID:
C16-1131
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Venue:
COLING
Publisher:
The COLING 2016 Organizing Committee
Pages:
1386–1397
URL:
https://aclanthology.org/C16-1131
Cite (ACL):
Jian Zhang, Xiaofeng Wu, Andy Way, and Qun Liu. 2016. Fast Gated Neural Domain Adaptation: Language Model as a Case Study. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 1386–1397, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Fast Gated Neural Domain Adaptation: Language Model as a Case Study (Zhang et al., COLING 2016)
PDF:
https://preview.aclanthology.org/update-css-js/C16-1131.pdf
Data:
Penn Treebank