Taisei Enomoto


2025

A Fair Comparison without Translationese: English vs. Target-language Instructions for Multilingual LLMs
Taisei Enomoto | Hwichan Kim | Zhousi Chen | Mamoru Komachi
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)

Most large language models are multilingual instruction executors. Prior studies have suggested that English instructions are more effective than target-language instructions even for non-English tasks; however, these studies often rely on datasets and instructions translated from English, which introduce biases known as translationese and hinder an unbiased comparison. To address this issue, we conduct a fair comparison between English and target-language instructions by eliminating translationese effects. Contrary to previous studies, our experiments across several tasks reveal that the advantage of adopting English instructions is not overwhelming. Additionally, we report on the features of the generated texts and the instruction-following abilities observed under each type of instruction.

2024

TMU-HIT at MLSP 2024: How Well Can GPT-4 Tackle Multilingual Lexical Simplification?
Taisei Enomoto | Hwichan Kim | Tosho Hirasawa | Yoshinari Nagai | Ayako Sato | Kyotaro Nakajima | Mamoru Komachi
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Lexical simplification (LS) is the process of replacing complex words with simpler alternatives to help readers understand sentences seamlessly. This process is divided into two primary subtasks: assessing word complexity and replacing high-complexity words with simpler alternatives. Employing task-specific supervised data to train models is a prevalent strategy for addressing these subtasks. However, such an approach cannot be employed for low-resource languages. Therefore, this paper introduces a multilingual LS pipeline system that does not rely on supervised data. Specifically, we have developed systems based on GPT-4 for each subtask. Our systems demonstrated top-class performance on both tasks in many languages. The results indicate that GPT-4 can effectively assess lexical complexity and simplify complex words in a multilingual context with high quality.

A Survey for LLM Tuning Methods: Classifying Approaches Based on Model Internal Accessibility
Kyotaro Nakajima | Hwichan Kim | Tosho Hirasawa | Taisei Enomoto | Zhousi Chen | Mamoru Komachi
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

2023

Simultaneous Domain Adaptation of Tokenization and Machine Translation
Taisei Enomoto | Tosho Hirasawa | Hwichan Kim | Teruaki Oka | Mamoru Komachi
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation