Mahendra Data




2021

ParaCotta: Synthetic Multilingual Paraphrase Corpora from the Most Diverse Translation Sample Pair
Alham Fikri Aji | Tirana Noor Fatyanosa | Radityo Eko Prasojo | Philip Arthur | Suci Fitriany | Salma Qonitah | Nadhifa Zulfa | Tomi Santoso | Mahendra Data
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

To Optimize, or Not to Optimize, That Is the Question: TelU-KU Models for WMT21 Large-Scale Multilingual Machine Translation
Sari Dewi Budiwati | Tirana Fatyanosa | Mahendra Data | Dedy Rahman Wijaya | Patrick Adolf Telnoni | Arie Ardiyanti Suryani | Agus Pratondo | Masayoshi Aritsugi
Proceedings of the Sixth Conference on Machine Translation

We describe the TelU-KU models for large-scale multilingual machine translation covering five Southeast Asian languages (Javanese, Indonesian, Malay, Tagalog, and Tamil) and English. We explore variations of the hyperparameters of the flores101_mm100_175M model, using random search on 10% of the datasets, to improve the BLEU scores of all thirty language pairs. We submitted two models, TelU-KU-175M and TelU-KU-175M_HPO, with average BLEU scores of 12.46 and 13.19, respectively. Our models show improvement in most language pairs after hyperparameter optimization. We also identified three language pairs that obtained a BLEU score of more than 15 while using fewer than 70 sentences from the training dataset: Indonesian-Tagalog, Tagalog-Indonesian, and Malay-Tagalog.