Sunil Yadav
2022
Rakuten’s Participation in WAT 2022: Parallel Dataset Filtering by Leveraging Vocabulary Heterogeneity
Alberto Poncelas | Johanes Effendi | Ohnmar Htun | Sunil Yadav | Dongzhe Wang | Saurabh Jain
Proceedings of the 9th Workshop on Asian Translation
This paper describes our neural machine translation system’s participation in the WAT 2022 shared translation task (team ID: sakura). We participated in the Parallel Data Filtering Task. Our approach, based on Feature Decay Algorithms (FDA), improved BLEU by 1.4 points for English-to-Japanese and 2.4 points for Japanese-to-English over the model trained on the full dataset, showing the effectiveness of FDA for in-domain data selection.
2021
Rakuten’s Participation in WAT 2021: Examining the Effectiveness of Pre-trained Models for Multilingual and Multimodal Machine Translation
Raymond Hendy Susanto | Dongzhe Wang | Sunil Yadav | Mausam Jain | Ohnmar Htun
Proceedings of the 8th Workshop on Asian Translation (WAT2021)
This paper describes our neural machine translation systems’ participation in the WAT 2021 shared translation tasks (team ID: sakura). We participated in the (i) NICT-SAP, (ii) Japanese-English multimodal translation, (iii) Multilingual Indic, and (iv) Myanmar-English translation tasks. Multilingual approaches such as mBART (Liu et al., 2020) pre-train a complete multilingual sequence-to-sequence model through denoising objectives, making them a strong starting point for building multilingual translation systems. Our main focus in this work is to investigate the effectiveness of multilingual finetuning of such a pre-trained model on various translation tasks, including low-resource, multimodal, and mixed-domain translation. We further explore a multimodal approach based on universal visual representation (Zhang et al., 2019) and compare its performance against a unimodal approach based on mBART alone.
Co-authors
- Dongzhe Wang 2
- Ohnmar Htun 2
- Raymond Hendy Susanto 1
- Mausam Jain 1
- Alberto Poncelas 1
Venues
- wat 2