The NITS-CNLP System for the Unsupervised MT Task at WMT 2020

Salam Michael Singh, Thoudam Doren Singh, Sivaji Bandyopadhyay


Abstract
We describe NITS-CNLP’s submission to the WMT 2020 unsupervised machine translation shared task for German (de) to Upper Sorbian (hsb) in a constrained setting, i.e., using only the data provided by the organizers. We train our unsupervised model on monolingual data from both languages by jointly pre-training the encoder and decoder, and then fine-tune it with a backtranslation loss. The final model uses the source-side (de) monolingual data and the target-side (hsb) synthetic data as pseudo-parallel data to train a pseudo-supervised system, which is tuned on the provided development set (dev set).
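The last step of the abstract, pairing real German sentences with synthetic Upper Sorbian output to form pseudo-parallel data, can be illustrated with a minimal sketch. This is not the authors' code: the function `build_pseudo_parallel`, the `translate_de_to_hsb` callable, and the toy stand-in model are all hypothetical names introduced here, and the sketch assumes the unsupervised model is available as a plain sentence-to-sentence function.

```python
from typing import Callable, Iterable, List, Tuple


def build_pseudo_parallel(
    de_sentences: Iterable[str],
    translate_de_to_hsb: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each real German sentence with its synthetic Upper Sorbian translation.

    `translate_de_to_hsb` is a hypothetical stand-in for the fine-tuned
    unsupervised model; any callable mapping a string to a string works.
    """
    pairs: List[Tuple[str, str]] = []
    for src in de_sentences:
        src = src.strip()
        if not src:
            continue  # skip empty lines in the monolingual corpus
        hyp = translate_de_to_hsb(src).strip()
        if hyp:
            # real source side, synthetic target side
            pairs.append((src, hyp))
    return pairs


def toy_model(sentence: str) -> str:
    """Toy placeholder (assumption): tags the input instead of translating it."""
    return "<hsb> " + sentence


corpus = build_pseudo_parallel(["Guten Morgen.", "", "Wie geht es dir?"], toy_model)
for de, hsb in corpus:
    print(de, "\t", hsb)
```

The resulting list of (de, hsb) pairs could then be written out as two aligned files and used to train a conventional supervised system, with the provided dev set held out for tuning.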
Anthology ID:
2020.wmt-1.135
Volume:
Proceedings of the Fifth Conference on Machine Translation
Month:
November
Year:
2020
Address:
Online
Venue:
WMT
SIG:
SIGMT
Publisher:
Association for Computational Linguistics
Pages:
1139–1143
URL:
https://aclanthology.org/2020.wmt-1.135
Cite (ACL):
Salam Michael Singh, Thoudam Doren Singh, and Sivaji Bandyopadhyay. 2020. The NITS-CNLP System for the Unsupervised MT Task at WMT 2020. In Proceedings of the Fifth Conference on Machine Translation, pages 1139–1143, Online. Association for Computational Linguistics.
Cite (Informal):
The NITS-CNLP System for the Unsupervised MT Task at WMT 2020 (Singh et al., WMT 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.wmt-1.135.pdf
Video:
https://slideslive.com/38939575