The NITS-CNLP System for the Unsupervised MT Task at WMT 2020
Salam Michael Singh, Thoudam Doren Singh, Sivaji Bandyopadhyay
Abstract
We describe NITS-CNLP’s submission to the WMT 2020 unsupervised machine translation shared task for German (de) to Upper Sorbian (hsb) in a constrained setting, i.e., using only the data provided by the organizers. We train our unsupervised model on monolingual data from both languages by jointly pre-training the encoder and decoder, and then fine-tune it using a backtranslation loss. The final model uses the source-side (de) monolingual data and the target-side (hsb) synthetic data as pseudo-parallel data to train a pseudo-supervised system, which is tuned on the provided development set (dev set).
- Anthology ID:
- 2020.wmt-1.135
- Volume:
- Proceedings of the Fifth Conference on Machine Translation
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1139–1143
- URL:
- https://aclanthology.org/2020.wmt-1.135
- Cite (ACL):
- Salam Michael Singh, Thoudam Doren Singh, and Sivaji Bandyopadhyay. 2020. The NITS-CNLP System for the Unsupervised MT Task at WMT 2020. In Proceedings of the Fifth Conference on Machine Translation, pages 1139–1143, Online. Association for Computational Linguistics.
- Cite (Informal):
- The NITS-CNLP System for the Unsupervised MT Task at WMT 2020 (Singh et al., WMT 2020)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2020.wmt-1.135.pdf