Abstract
Although new corpora are becoming increasingly available for machine translation, typically only those that belong to the same or a similar domain improve translation performance. Recently, Neural Machine Translation (NMT) has become prominent in the field. However, most existing domain adaptation methods focus only on phrase-based machine translation. In this paper, we exploit NMT's internal embedding of the source sentence and use sentence embedding similarity to select the sentences that are close to the in-domain data. The empirical adaptation results on the IWSLT English-French and NIST Chinese-English tasks show that the proposed methods can substantially improve NMT performance by 2.4-9.0 BLEU points, outperforming the existing state-of-the-art baseline by 2.3-4.5 BLEU points.
- Anthology ID:
- P17-2089
- Volume:
- Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
- Month:
- July
- Year:
- 2017
- Address:
- Vancouver, Canada
- Editors:
- Regina Barzilay, Min-Yen Kan
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 560–566
- URL:
- https://aclanthology.org/P17-2089
- DOI:
- 10.18653/v1/P17-2089
- Cite (ACL):
- Rui Wang, Andrew Finch, Masao Utiyama, and Eiichiro Sumita. 2017. Sentence Embedding for Neural Machine Translation Domain Adaptation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 560–566, Vancouver, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Sentence Embedding for Neural Machine Translation Domain Adaptation (Wang et al., ACL 2017)
- PDF:
- https://preview.aclanthology.org/landing_page/P17-2089.pdf
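
The selection idea described in the abstract (rank candidate sentences by how close their embeddings are to the in-domain data) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the similarity measure, so this sketch uses cosine similarity to the in-domain centroid, and the function names (`centroid`, `cosine`, `select_closest`) and the two-dimensional toy vectors are hypothetical stand-ins for embeddings that would come from a trained NMT encoder.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_closest(
    in_domain: Sequence[Vector],
    candidates: Sequence[Tuple[str, Vector]],
    k: int,
) -> List[str]:
    """Return the k candidate sentences most similar to the in-domain centroid.

    Assumption: similarity is cosine to the centroid; the abstract does not
    state which measure the authors use.
    """
    center = centroid(in_domain)
    scored = [(cosine(vec, center), sent) for sent, vec in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sent for _, sent in scored[:k]]


# Toy embeddings (hypothetical); real ones would come from the NMT encoder.
in_domain_vecs = [[0.9, 0.1], [0.8, 0.2]]
out_of_domain = [
    ("talk transcript", [1.0, 0.0]),
    ("patent claim", [0.0, 1.0]),
    ("lecture notes", [0.7, 0.3]),
]

selected = select_closest(in_domain_vecs, out_of_domain, k=2)
print(selected)  # ['talk transcript', 'lecture notes']
```

In an adaptation setup of this kind, the selected sentences would then be added to the in-domain training data; how many to keep (`k` here) is a tuning choice the abstract does not specify.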