Yunwon Tae
2023
PePe: Personalized Post-editing Model utilizing User-generated Post-edits
Jihyeon Lee | Taehee Kim | Yunwon Tae | Cheonbok Park | Jaegul Choo
Findings of the Association for Computational Linguistics: EACL 2023
Incorporating personal preference is crucial in advanced machine translation tasks. Despite recent advances in machine translation, properly reflecting personal style remains a demanding task. In this paper, we introduce a personalized automatic post-editing framework to address this challenge, which effectively generates sentences that account for distinct personal behaviors. To build this framework, we first collect post-editing data that reflects user preferences from a live machine translation system. Specifically, real-world users enter source sentences for translation and edit the machine-translated outputs according to their preferred style. We then propose a model that combines a discriminator module and user-specific parameters within the automatic post-editing (APE) framework. Experimental results show that the proposed method outperforms the baseline models on four different metrics (i.e., BLEU, TER, YiSi-1, and human evaluation).
2021
Unsupervised Neural Machine Translation for Low-Resource Domains via Meta-Learning
Cheonbok Park | Yunwon Tae | TaeHee Kim | Soyoung Yang | Mohammad Azam Khan | Lucy Park | Jaegul Choo
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Unsupervised machine translation, which utilizes unpaired monolingual corpora as training data, has achieved performance comparable to that of supervised machine translation. However, it still struggles in data-scarce domains. To address this issue, this paper presents a novel meta-learning algorithm for unsupervised neural machine translation (UNMT) that trains the model to adapt to another domain by utilizing only a small amount of training data. We assume that domain-general knowledge is a significant factor in handling data-scarce domains. Hence, we extend the meta-learning algorithm, which utilizes knowledge learned from high-resource domains, to boost the performance of low-resource UNMT. Our model surpasses a transfer learning-based approach by up to 2-3 BLEU points. Extensive experimental results show that our proposed algorithm is well suited to fast adaptation and consistently outperforms other baselines.