Saurabh Jain
2022
Rakuten’s Participation in WAT 2022: Parallel Dataset Filtering by Leveraging Vocabulary Heterogeneity
Alberto Poncelas | Johanes Effendi | Ohnmar Htun | Sunil Yadav | Dongzhe Wang | Saurabh Jain
Proceedings of the 9th Workshop on Asian Translation
This paper describes our neural machine translation system's participation in the WAT 2022 shared translation task (team ID: sakura). We participated in the Parallel Data Filtering Task. Our approach, based on Feature Decay Algorithms (FDA), improved BLEU by 1.4 points for English-to-Japanese and 2.4 points for Japanese-to-English over the model trained on the full dataset, showing the effectiveness of FDA for in-domain data selection.
Comparative Snippet Generation
Saurabh Jain | Yisong Miao | Min-Yen Kan
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)
We model product reviews to generate comparative responses that combine positive and negative experiences with a product. Specifically, we generate a single-sentence comparative response from a given positive opinion and a given negative opinion. We contribute the first dataset for this task, Comparative Snippet Generation from contrasting opinions about a product, and an analysis of how well a pre-trained BERT model generates such snippets.
Co-authors
- Alberto Poncelas 1
- Johanes Effendi 1
- Ohnmar Htun 1
- Sunil Yadav 1
- Dongzhe Wang 1