Sentence Weighting for Neural Machine Translation Domain Adaptation

Shiqi Zhang, Deyi Xiong


Abstract
In this paper, we propose a new sentence weighting method for the domain adaptation of neural machine translation. We introduce a domain similarity metric to evaluate the relevance between a sentence and an entire available domain dataset. The similarity of each sentence to the target domain is calculated with various methods. The computed similarity is then integrated into the training objective to weight sentences. The adaptation results on both the IWSLT Chinese-English TED task and a task with only synthetic parallel training data show that our sentence weighting method achieves a significant improvement over strong baselines.
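As a rough illustration of the idea in the abstract, the sketch below weights each sentence's contribution to a negative log-likelihood objective by its similarity to the target domain. The abstract does not specify the similarity metric or the exact objective, so everything here is an assumption: the similarity is taken to be cosine similarity between a sentence vector and the centroid of in-domain sentence vectors, rescaled to [0, 1], and the function names (`cosine`, `centroid`, `sentence_weights`, `weighted_nll`) are hypothetical. It is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sentence_weights(sentence_vecs, in_domain_vecs):
    """Assumed similarity: cosine to the in-domain centroid, mapped from [-1, 1] to [0, 1]."""
    c = centroid(in_domain_vecs)
    return [(1.0 + cosine(v, c)) / 2.0 for v in sentence_vecs]


def weighted_nll(sentence_log_probs, weights):
    """Weighted negative log-likelihood, normalized by the total weight.

    sentence_log_probs[i] stands for log p(y_i | x_i) of sentence pair i under the model.
    """
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all sentence weights are zero")
    return -sum(w * lp for w, lp in zip(weights, sentence_log_probs)) / total


# Toy usage: three training sentences with made-up 2-d vectors and log-probabilities.
in_domain = [[1.0, 0.0], [1.0, 0.0]]
train_vecs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights = sentence_weights(train_vecs, in_domain)
loss = weighted_nll([-1.0, -2.0, -3.0], weights)
```

In this toy setup, sentences closer to the in-domain centroid receive larger weights and therefore dominate the loss, while a sentence pointing away from the centroid contributes nothing.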
Anthology ID:
C18-1269
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
3181–3190
URL:
https://aclanthology.org/C18-1269
Cite (ACL):
Shiqi Zhang and Deyi Xiong. 2018. Sentence Weighting for Neural Machine Translation Domain Adaptation. In Proceedings of the 27th International Conference on Computational Linguistics, pages 3181–3190, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Sentence Weighting for Neural Machine Translation Domain Adaptation (Zhang & Xiong, COLING 2018)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/C18-1269.pdf