Hwichan Kim

2025

pdf bib abs
Can Language Neuron Intervention Reduce Non-Target Language Output?
Suchun Xie | Hwichan Kim | Shota Sasaki | Kosuke Yamada | Jun Suzuki
Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP

Large language models (LLMs) often fail to generate text in the intended target language, particularly in non-English interactions. Concurrently, recent work has explored Language Neuron Intervention (LNI) as a promising technique for steering output language. In this paper, we re-evaluate LNI in more practical scenarios—specifically with instruction-tuned models and prompts that explicitly specify the target language. Our experiments show that while LNI also shows potential in such practical scenarios, its average effect is limited and unstable across models and tasks, with a 0.83% reduction in undesired language output and a 0.1% improvement in performance. Our further analysis identifies two key factors for LNI’s limitation: (1) LNI affects both the output language and the content semantics, making it hard to control one without affecting the other, which explains the weak performance gains. (2) LNI increases the target language token probabilities, but they often remain below the top-1generation threshold, resulting in failure to generate the target language in most cases. Our results highlight both the potential and limitations of LNI, paving the way for future improvements.

pdf bib abs
A Fair Comparison without Translationese: English vs. Target-language Instructions for Multilingual LLMs
Taisei Enomoto | Hwichan Kim | Zhousi Chen | Mamoru Komachi
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers)

Most large language models are multilingual instruction executors. Prior studies suggested that English instructions are more effective than target-language instructions even for non-English tasks; however, these studies often use datasets and instructions translated from English, which introduce biases known as translationese, hindering an unbiased comparison. To address this issue, we conduct a fair comparison between English and target-language instructions by eliminating translationese effects. Contrary to previous studies, our experiments across several tasks reveal that the advantage of adopting English instructions is not overwhelming. Additionally, we report on the features of generated texts and the instruction-following abilities when using respective instructions.

2024

pdf bib abs
TMU-HIT at MLSP 2024: How Well Can GPT-4 Tackle Multilingual Lexical Simplification?
Taisei Enomoto | Hwichan Kim | Tosho Hirasawa | Yoshinari Nagai | Ayako Sato | Kyotaro Nakajima | Mamoru Komachi
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Lexical simplification (LS) is a process of replacing complex words with simpler alternatives to help readers understand sentences seamlessly. This process is divided into two primary subtasks: assessing word complexities and replacing high-complexity words with simpler alternatives. Employing task-specific supervised data to train models is a prevalent strategy for addressing these subtasks. However, such approach cannot be employed for low-resource languages. Therefore, this paper introduces a multilingual LS pipeline system that does not rely on supervised data. Specifically, we have developed systems based on GPT-4 for each subtask. Our systems demonstrated top-class performance on both tasks in many languages. The results indicate that GPT-4 can effectively assess lexical complexity and simplify complex words in a multilingual context with high quality.

pdf bib abs
Pruning Multilingual Large Language Models for Multilingual Inference
Hwichan Kim | Jun Suzuki | Tosho Hirasawa | Mamoru Komachi
Findings of the Association for Computational Linguistics: EMNLP 2024

Multilingual large language models (MLLMs), trained on multilingual balanced data, demonstrate better zero-shot learning performance in non-English languages compared to large language models trained on English-dominant data. However, the disparity in performance between English and non-English languages remains a challenge yet to be fully addressed. This study introduces a promising direction for enhancing non-English performance through a specialized pruning approach. Specifically, we prune MLLMs using bilingual sentence pairs from English and other languages and empirically demonstrate that this pruning strategy can enhance the MLLMs’ performance in non-English language.

pdf bib abs
A Single Linear Layer Yields Task-Adapted Low-Rank Matrices
Hwichan Kim | Shota Sasaki | Sho Hoshino | Ukyo Honda
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Low-Rank Adaptation (LoRA) is a widely used Parameter-Efficient Fine-Tuning (PEFT) method that updates an initial weight matrix W₀ with a delta matrix 𝛥 W consisted by two low-rank matrices A and B. A previous study suggested that there is correlation between W₀ and 𝛥 W. In this study, we aim to delve deeper into relationships between W₀ and low-rank matrices A and B to further comprehend the behavior of LoRA. In particular, we analyze a conversion matrix that transform W₀ into low-rank matrices, which encapsulates information about the relationships. Our analysis reveals that the conversion matrices are similar across each layer. Inspired by these findings, we hypothesize that a single linear layer, which takes each layer’s W₀ as input, can yield task-adapted low-rank matrices. To confirm this hypothesis, we devise a method named Conditionally Parameterized LoRA (CondLoRA) that updates initial weight matrices with low-rank matrices derived from a single linear layer. Our empirical results show that CondLoRA maintains a performance on par with LoRA, despite the fact that the trainable parameters of CondLoRA are fewer than those of LoRA. Therefore, we conclude that “a single linear layer yields task-adapted low-rank matrices.” The code used in our experiments is available at https://github.com/CyberAgentAILab/CondLoRA.

pdf bib
DejaVu: Disambiguation evaluation dataset for English-JApanese machine translation on VisUal information
Ayako Sato | Tosho Hirasawa | Hwichan Kim | Zhousi Chen | Teruaki Oka | Masato Mita | Mamoru Komachi
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

pdf bib
A Survey for LLM Tuning Methods:Classifying Approaches Based on Model Internal Accessibility
Kyotaro Nakajima | Hwichan Kim | Tosho Hirasawa | Taisei Enomoto | Zhousi Chen | Mamoru Komachi
Proceedings of the 38th Pacific Asia Conference on Language, Information and Computation

pdf bib abs
TMU-HIT’s Submission for the WMT24 Quality Estimation Shared Task: Is GPT-4 a Good Evaluator for Machine Translation?
Ayako Sato | Kyotaro Nakajima | Hwichan Kim | Zhousi Chen | Mamoru Komachi
Proceedings of the Ninth Conference on Machine Translation

In machine translation quality estimation (QE), translation quality is evaluated automatically without the need for reference translations. This paper describes our contribution to the sentence-level subtask of Task 1 at the Ninth Machine Translation Conference (WMT24), which predicts quality scores for neural MT outputs without reference translations. We fine-tune GPT-4o mini, a large-scale language model (LLM), with limited data for QE.We report results for the direct assessment (DA) method for four language pairs: English-Gujarati (En-Gu), English-Hindi (En-Hi), English-Tamil (En-Ta), and English-Telugu (En-Te).Experiments under zero-shot, few-shot prompting, and fine-tuning settings revealed significantly low performance in the zero-shot, while fine-tuning achieved accuracy comparable to last year’s best scores. Our system demonstrated the effectiveness of this approach in low-resource language QE, securing 1st place in both En-Gu and En-Hi, and 4th place in En-Ta and En-Te.

2023

pdf bib abs
Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation?
Hiroto Tamura | Tosho Hirasawa | Hwichan Kim | Mamoru Komachi
Findings of the Association for Computational Linguistics: EACL 2023

Pre-training masked language models (MLMs) with artificial data has been proven beneficial for several natural language processing tasks such as natural language understanding and summarization; however, it has been less explored for neural machine translation (NMT).A previous study revealed the benefit of transfer learning for NMT in a limited setup, which differs from MLM.In this study, we prepared two kinds of artificial data and compared the translation performance of NMT when pre-trained with MLM.In addition to the random sequences, we created artificial data mimicking token frequency information from the real world. Our results showed that pre-training the models with artificial data by MLM improves translation performance in low-resource situations. Additionally, we found that pre-training on artificial data created considering token frequency information facilitates improved performance.

pdf bib abs
Enhancing Few-shot Cross-lingual Transfer with Target Language Peculiar Examples
Hwichan Kim | Mamoru Komachi
Findings of the Association for Computational Linguistics: ACL 2023

Few-shot cross-lingual transfer, fine-tuning Multilingual Masked Language Model (MMLM) with source language labeled data and a small amount of target language labeled data, provides excellent performance in the target language. However, if no labeled data in the target language are available, they need to be created through human annotations. In this study, we devise a metric to select annotation candidates from an unlabeled data pool that efficiently enhance accuracy for few-shot cross-lingual transfer. It is known that training a model with hard examples is important to improve the model’s performance. Therefore, we first identify examples that MMLM cannot solve in a zero-shot cross-lingual transfer setting and demonstrate that it is hard to predict peculiar examples in the target language, i.e., the examples distant from the source language examples in cross-lingual semantic space of the MMLM.We then choose high peculiarity examples as annotation candidates and perform few-shot cross-lingual transfer. In comprehensive experiments with 20 languages and 6 tasks, we demonstrate that the high peculiarity examples improve the target language accuracy compared to other candidate selection methods proposed in previous studies.

pdf bib
Simultaneous Domain Adaptation of Tokenization and Machine Translation
Taisei Enomoto | Tosho Hirasawa | Hwichan Kim | Teruaki Oka | Mamoru Komachi
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

2022

pdf bib abs
Learning How to Translate North Korean through South Korean
Hwichan Kim | Sangwhan Moon | Naoaki Okazaki | Mamoru Komachi
Proceedings of the Thirteenth Language Resources and Evaluation Conference

South and North Korea both use the Korean language. However, Korean NLP research has focused on South Korean only, and existing NLP systems of the Korean language, such as neural machine translation (NMT) models, cannot properly handle North Korean inputs. Training a model using North Korean data is the most straightforward approach to solving this problem, but there is insufficient data to train NMT models. In this study, we create data for North Korean NMT models using a comparable corpus. First, we manually create evaluation data for automatic alignment and machine translation, and then, investigate automatic alignment methods suitable for North Korean. Finally, we show that a model trained by North Korean bilingual data without human annotation significantly boosts North Korean translation accuracy compared to existing South Korean models in zero-shot settings.

2021

pdf bib
Can Monolingual Pre-trained Encoder-Decoder Improve NMT for Distant Language Pairs?
Hwichan Kim | Mamoru Komachi
Proceedings of the 35th Pacific Asia Conference on Language, Information and Computation

pdf bib abs
TMU NMT System with Japanese BART for the Patent task of WAT 2021
Hwichan Kim | Mamoru Komachi
Proceedings of the 8th Workshop on Asian Translation (WAT2021)

In this paper, we introduce our TMU Neural Machine Translation (NMT) system submitted for the Patent task (Korean Japanese and English Japanese) of 8th Workshop on Asian Translation (Nakazawa et al., 2021). Recently, several studies proposed pre-trained encoder-decoder models using monolingual data. One of the pre-trained models, BART (Lewis et al., 2020), was shown to improve translation accuracy via fine-tuning with bilingual data. However, they experimented only Romanian!English translation using English BART. In this paper, we examine the effectiveness of Japanese BART using Japan Patent Office Corpus 2.0. Our experiments indicate that Japanese BART can also improve translation accuracy in both Korean Japanese and English Japanese translations.

2020

pdf bib abs
Zero-shot North Korean to English Neural Machine Translation by Character Tokenization and Phoneme Decomposition
Hwichan Kim | Tosho Hirasawa | Mamoru Komachi
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

The primary limitation of North Korean to English translation is the lack of a parallel corpus; therefore, high translation accuracy cannot be achieved. To address this problem, we propose a zero-shot approach using South Korean data, which are remarkably similar to North Korean data. We train a neural machine translation model after tokenizing a South Korean text at the character level and decomposing characters into phonemes. We demonstrate that our method can effectively learn North Korean to English translation and improve the BLEU scores by +1.01 points in comparison with the baseline.

pdf bib abs
Korean-to-Japanese Neural Machine Translation System using Hanja Information
Hwichan Kim | Tosho Hirasawa | Mamoru Komachi
Proceedings of the 7th Workshop on Asian Translation

In this paper, we describe our TMU neural machine translation (NMT) system submitted for the Patent task (Korean→Japanese) of the 7th Workshop on Asian Translation (WAT 2020, Nakazawa et al., 2020). We propose a novel method to train a Korean-to-Japanese translation model. Specifically, we focus on the vocabulary overlap of Korean Hanja words and Japanese Kanji words, and propose strategies to leverage Hanja information. Our experiment shows that Hanja information is effective within a specific domain, leading to an improvement in the BLEU scores by +1.09 points compared to the baseline.