2025
UB_Tel-U at SemEval-2025 Task 11: Emotions Without Borders - A Unified Framework for Multilingual Classification Using Augmentation and Ensemble
Tirana Noor Fatyanosa | Putra Pandu Adikara | Rochmanu Erfitra | Muhammad Dikna | Sari Dewi Budiwati | Cahyana Cahyana
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
In this SemEval-2025 Task 11 paper, we tackle three tracks: Multi-label Emotion Detection (Track A), Emotion Intensity (Track B), and Cross-lingual Emotion Detection (Track C). Our approach harnesses diverse external corpora and robust data augmentation techniques across Spanish, English, and Arabic, enhancing both the diversity and resilience of the dataset. Instead of developing separate models for each language, we merge the data into a unified multilingual dataset, enabling our model to learn cross-lingual patterns and relationships simultaneously. Our ensemble architecture integrates the multilingual strengths of XLM-RoBERTa, a zero-shot classification capability via LLaMA 3, and a specialized pretrained model fine-tuned on English emotion classification. Notably, our system ranked 13th for Afrikaans (afr) in Track A, 13th for Amharic (amh) in Track B, and 4th for Hindi (hin) in Track C.
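To make the ensemble idea concrete, here is a minimal Python sketch that averages per-label probabilities from two Hugging Face pipelines: a fine-tuned English emotion classifier and an NLI-based zero-shot classifier standing in for the LLaMA 3 component. The checkpoints, label subset, equal weighting, and 0.5 threshold are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal probability-averaging ensemble sketch for multi-label emotion
# detection. Checkpoints, labels, and equal weights are assumptions; the
# paper's system combines XLM-RoBERTa, LLaMA 3 zero-shot classification,
# and a model fine-tuned on English emotion data.
from transformers import pipeline

LABELS = ["anger", "fear", "joy", "sadness", "surprise"]  # assumed label set

# Stand-in for the fine-tuned emotion classifier (multi-label head).
emotion_clf = pipeline("text-classification",
                       model="SamLowe/roberta-base-go_emotions", top_k=None)

# Stand-in for the zero-shot component (NLI-based rather than LLaMA 3).
zero_shot = pipeline("zero-shot-classification",
                     model="facebook/bart-large-mnli")

def ensemble_scores(text: str) -> dict:
    """Average per-label probabilities from both ensemble members."""
    clf_scores = {d["label"]: d["score"] for d in emotion_clf([text])[0]}
    zs = zero_shot(text, candidate_labels=LABELS, multi_label=True)
    zs_scores = dict(zip(zs["labels"], zs["scores"]))
    return {lab: (clf_scores.get(lab, 0.0) + zs_scores[lab]) / 2
            for lab in LABELS}

# Labels whose averaged score clears a threshold (e.g. 0.5) are predicted.
print({k: round(v, 3)
       for k, v in ensemble_scores("I can't believe we finally won!").items()})
```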
2021
To Optimize, or Not to Optimize, That Is the Question: TelU-KU Models for WMT21 Large-Scale Multilingual Machine Translation
Sari Dewi Budiwati | Tirana Fatyanosa | Mahendra Data | Dedy Rahman Wijaya | Patrick Adolf Telnoni | Arie Ardiyanti Suryani | Agus Pratondo | Masayoshi Aritsugi
Proceedings of the Sixth Conference on Machine Translation
We describe the TelU-KU models for large-scale multilingual machine translation over five Southeast Asian languages (Javanese, Indonesian, Malay, Tagalog, and Tamil) and English. We explore variations of the hyperparameters of the flores101_mm100_175M model using random search on 10% of the datasets to improve the BLEU scores of all thirty language pairs. We submitted two models, TelU-KU-175M and TelU-KU-175M_HPO, with average BLEU scores of 12.46 and 13.19, respectively. Our models show improvement in most language pairs after optimizing the hyperparameters. We also identified three language pairs that obtained a BLEU score of more than 15 while using fewer than 70 sentences of the training dataset: Indonesian-Tagalog, Tagalog-Indonesian, and Malay-Tagalog.
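As a rough illustration of the hyperparameter optimization described above, the Python sketch below runs a random-search loop over a discrete search space. The search space and the `train_and_score` helper (which would fine-tune flores101_mm100_175M on the 10% split and return an average BLEU score) are hypothetical stand-ins, not the configuration actually used for TelU-KU-175M_HPO.

```python
# Minimal random-search sketch for hyperparameter optimization. The search
# space is illustrative; `train_and_score` is a hypothetical helper that
# trains on the 10% subset and returns an average BLEU score.
import random

SEARCH_SPACE = {
    "lr": [1e-4, 5e-4, 1e-3],
    "dropout": [0.1, 0.2, 0.3],
    "label_smoothing": [0.1, 0.2],
    "warmup_updates": [2000, 4000],
}

def sample_config(rng: random.Random) -> dict:
    """Draw one configuration uniformly at random from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(train_and_score, trials: int = 20, seed: int = 13):
    """Keep the configuration with the best BLEU across random trials."""
    rng = random.Random(seed)
    best_cfg, best_bleu = None, float("-inf")
    for _ in range(trials):
        cfg = sample_config(rng)
        bleu = train_and_score(cfg)  # fine-tune on the subset, return BLEU
        if bleu > best_bleu:
            best_cfg, best_bleu = cfg, bleu
    return best_cfg, best_bleu
```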
2019
DBMS-KU Interpolation for WMT19 News Translation Task
Sari Dewi Budiwati | Al Hafiz Akbar Maulana Siagian | Tirana Noor Fatyanosa | Masayoshi Aritsugi
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)
This paper presents the participation of the DBMS-KU Interpolation system in the WMT19 shared task for the Kazakh-English language pair. We examine the use of an interpolation method with different language model orders. Our Interpolation system combines direct translation with pivot translation through Russian. We use 3-gram and 5-gram language model orders to perform the translation in this work. To reduce noise in the pivot translation process, we prune the source-pivot and pivot-target phrase tables. Our experimental results show that our Interpolation system outperforms the baseline in terms of BLEU-cased score by +0.5 and +0.1 points for Kazakh-English and English-Kazakh, respectively. In particular, using the 5-gram language model order in our system obtains a better BLEU-cased score than the 3-gram one. Interestingly, we found that employing the Interpolation system could reduce the perplexity score of English-Kazakh when using the 3-gram language model order.
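To make the interpolation step concrete, here is a minimal Python sketch that linearly combines word probabilities from two language models (for example, one estimated from direct training data and one from pivot-translated data) and computes perplexity. The probability functions `p_direct` and `p_pivot` and the mixture weight are hypothetical; in practice this mixing is done inside standard n-gram LM toolkits rather than by hand.

```python
# Minimal sketch of linear language-model interpolation and perplexity.
# p_direct and p_pivot are hypothetical smoothed n-gram models returning
# non-zero probabilities p(word | history); lam is the mixture weight.
import math

def interpolate(p_direct, p_pivot, lam: float):
    """Return p(w | h) = lam * p_direct(w | h) + (1 - lam) * p_pivot(w | h)."""
    def p(word, history):
        return lam * p_direct(word, history) + (1 - lam) * p_pivot(word, history)
    return p

def perplexity(p, tokens, order: int = 5):
    """Perplexity of a token sequence under model p with n-gram histories."""
    log_sum = 0.0
    for i, w in enumerate(tokens):
        history = tuple(tokens[max(0, i - (order - 1)):i])
        log_sum += math.log(p(w, history))  # assumes smoothed, non-zero probs
    return math.exp(-log_sum / len(tokens))
```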