Improving Document-Level Neural Machine Translation with Domain Adaptation

Sami Ul Haq, Sadaf Abdul Rauf, Arslan Shoukat, Noor-e-Hira


Abstract
Recent studies have shown that the translation quality of NMT systems can be improved by providing document-level contextual information. In general, sentence-based NMT models are extended to capture contextual information from large-scale document-level corpora, which are difficult to acquire. Domain adaptation, on the other hand, promises to adapt components of already developed systems by exploiting limited in-domain data. This paper presents FJWU's system submission at WNGT; we specifically participated in the document-level MT task for German-English translation. Our system is based on a context-aware Transformer model developed on top of the original NMT architecture by integrating contextual information using attention networks. Our experimental results show that providing previous sentences as context significantly improves the BLEU score compared to a strong NMT baseline. We also studied the impact of domain adaptation on document-level translation and were able to improve results by adapting the systems according to the testing domain.
Anthology ID:
2020.ngt-1.27
Volume:
Proceedings of the Fourth Workshop on Neural Generation and Translation
Month:
July
Year:
2020
Address:
Online
Venue:
NGT
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
225–231
Language:
URL:
https://aclanthology.org/2020.ngt-1.27
DOI:
10.18653/v1/2020.ngt-1.27
Bibkey:
Cite (ACL):
Sami Ul Haq, Sadaf Abdul Rauf, Arslan Shoukat, and Noor-e-Hira. 2020. Improving Document-Level Neural Machine Translation with Domain Adaptation. In Proceedings of the Fourth Workshop on Neural Generation and Translation, pages 225–231, Online. Association for Computational Linguistics.
Cite (Informal):
Improving Document-Level Neural Machine Translation with Domain Adaptation (Ul Haq et al., NGT 2020)
PDF:
https://preview.aclanthology.org/starsem-semeval-split/2020.ngt-1.27.pdf