IITP-MT System for Gujarati-English News Translation Task at WMT 2019
Sukanta Sen, Kamal Kumar Gupta, Asif Ekbal, Pushpak Bhattacharyya
Abstract
We describe our submission to the WMT 2019 News Translation shared task for the Gujarati-English language pair. We submit constrained systems, i.e., we rely only on the data provided for this language pair and do not use any external data. We train a Transformer-based subword-level neural machine translation (NMT) system using the original parallel corpus along with a synthetic parallel corpus obtained through back-translation of monolingual data. Our primary systems achieve BLEU scores of 10.4 and 8.1 for Gujarati→English and English→Gujarati, respectively. We observe that incorporating monolingual data through back-translation improves the BLEU score significantly over the baseline NMT and SMT systems for this language pair.
- Anthology ID:
- W19-5346
- Volume:
- Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Venue:
- WMT
- SIG:
- SIGMT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 407–411
- URL:
- https://aclanthology.org/W19-5346
- DOI:
- 10.18653/v1/W19-5346
- Cite (ACL):
- Sukanta Sen, Kamal Kumar Gupta, Asif Ekbal, and Pushpak Bhattacharyya. 2019. IITP-MT System for Gujarati-English News Translation Task at WMT 2019. In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1), pages 407–411, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- IITP-MT System for Gujarati-English News Translation Task at WMT 2019 (Sen et al., WMT 2019)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/W19-5346.pdf