Abstract
In this paper we present an English–Hindi and Hindi–English neural machine translation (NMT) system submitted to the translation shared task organized at WAT 2020. We trained a multilingual NMT system based on the Transformer architecture. We show (i) how effective pre-processing improves performance, (ii) how synthetic data generated by back-translating available monolingual data improves overall translation performance, and (iii) how language similarity can help further. Our submissions ranked 1st in both English-to-Hindi and Hindi-to-English translation, achieving BLEU scores of 20.80 and 29.59, respectively.
- Anthology ID:
- 2020.wat-1.14
- Volume:
- Proceedings of the 7th Workshop on Asian Translation
- Month:
- December
- Year:
- 2020
- Address:
- Suzhou, China
- Editors:
- Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Win Pa Pa, Ondřej Bojar, Shantipriya Parida, Isao Goto, Hidaya Mino, Hiroshi Manabe, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
- Venue:
- WAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 122–126
- URL:
- https://aclanthology.org/2020.wat-1.14
- Cite (ACL):
- Santanu Pal. 2020. WT: Wipro AI Submissions to the WAT 2020. In Proceedings of the 7th Workshop on Asian Translation, pages 122–126, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- WT: Wipro AI Submissions to the WAT 2020 (Pal, WAT 2020)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2020.wat-1.14.pdf
- Data
- WMT 2014