Jaewook Lee

2025

pdf bib abs
Does the Emotional Understanding of LVLMs Vary Under High-Stress Environments and Across Different Demographic Attributes?
Jaewook Lee | Yeajin Jang | Oh-Woog Kwon | Harksoo Kim
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

According to psychological and neuroscientific research, a high-stress environment can restrict attentional resources and intensify negative affect, thereby impairing the ability to understand emotions. Furthermore, demographic attributes such as race, gender, and age group have been repeatedly reported to cause significant differences in emotional expression and recognition. This study is the first to systematically verify whether these psychological findings observed in humans also apply to the latest Large Vision Language Models (LVLMs). We constructed low-stress versus high-stress environments and generated an image dataset (a total of 540 images) that combines race, gender, and age group. Based on this, we applied the Pretend prompt technique to induce LVLMs to interpret others’ emotions from the standpoint of the assigned environment and persona. An analysis of the models’ emotional understanding ability, using EQ-Bench-based metrics, revealed that (1) under high-stress environments, the accuracy of emotion understanding significantly declined in most LVLMs, and (2) performance disparities were confirmed across race, gender, and age group. These findings suggest that the effects of high-stress and demographic attributes identified in human research may also be reflected in LVLMs.

pdf bib abs
Small Changes, Big Impact: How Manipulating a Few Neurons Can Drastically Alter LLM Aggression
Jaewook Lee | Junseo Jang | Oh-Woog Kwon | Harksoo Kim
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recent remarkable advances in Large Language Models (LLMs) have led to innovations in various domains such as education, healthcare, and finance, while also raising serious concerns that they can be easily misused for malicious purposes. Most previous research has focused primarily on observing how jailbreak attack techniques bypass safety mechanisms like Reinforcement Learning through Human Feedback (RLHF). However, whether there are neurons within LLMs that directly govern aggression has not been sufficiently investigated. To fill this gap, this study identifies specific neurons (“aggression neurons”) closely related to the expression of aggression and systematically analyzes how manipulating them affects the model’s overall aggression. Specifically, using a large-scale synthetic text corpus (aggressive and non-aggressive), we measure the activation frequency of each neuron, then apply masking and activation techniques to quantitatively evaluate changes in aggression by layer and by manipulation ratio. Experimental results show that, in all models, manipulating only a small number of neurons can increase aggression by up to 33%, and the effect is even more extreme when aggression neurons are concentrated in certain layers. Moreover, even models of the same scale exhibit nonlinear changes in aggression patterns, suggesting that simple external safety measures alone may not be sufficient for complete defense.

pdf bib abs
Do Large Language Models Have “Emotion Neurons”? Investigating the Existence and Role
Jaewook Lee | Woojin Lee | Oh-Woog Kwon | Harksoo Kim
Findings of the Association for Computational Linguistics: ACL 2025

This study comprehensively explores whether there actually exist “emotion neurons” within large language models (LLMs) that selectively process and express certain emotions, and what functional role they play. Drawing on the representative emotion theory of the six basic emotions, we focus on six core emotions. Using synthetic dialogue data labeled with emotions, we identified sets of neurons that exhibit consistent activation patterns for each emotion. As a result, we confirmed that principal neurons handling emotion information do indeed exist within the model, forming distinct groups for each emotion, and that their distribution varies with model size and architectural depth. We then validated the functional significance of these emotion neurons by analyzing whether the prediction accuracy for a specific emotion significantly decreases when those neurons are artificially removed. We observed that in some emotions, the accuracy drops sharply upon neuron removal, while in others, the model’s performance largely remains intact or even improves, presumably due to overlapping and complementary mechanisms among neurons. Furthermore, by examining how prediction accuracy changes depending on which layer range and at what proportion the emotion neurons are masked, we revealed that emotion information is processed in a multilayered and complex manner within the model.

pdf bib abs
CoME: An Unlearning-based Approach to Conflict-free Model Editing
Dahyun Jung | Jaehyung Seo | Jaewook Lee | Chanjun Park | Heuiseok Lim
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Large language models (LLMs) often retain outdated or incorrect information from pre-training, which undermines their reliability. While model editing methods have been developed to address such errors without full re-training, they frequently suffer from knowledge conflicts, where outdated information interferes with new knowledge. In this work, we propose Conflict-free Model Editing (CoME), a novel framework that enhances the accuracy of knowledge updates in LLMs by selectively removing outdated knowledge. CoME leverages unlearning to mitigate knowledge interference, allowing new information to be integrated without compromising relevant linguistic features. Through experiments on GPT-J and LLaMA-3 using Counterfact and ZsRE datasets, we demonstrate that CoME improves both editing accuracy and model reliability when applied to existing editing methods. Our results highlight that the targeted removal of outdated knowledge is crucial for enhancing model editing effectiveness and maintaining the model’s generative performance.

2024

pdf bib abs
Analyzing Key Factors Influencing Emotion Prediction Performance of VLLMs in Conversational Contexts
Jaewook Lee | Yeajin Jang | Hongjin Kim | Woojin Lee | Harksoo Kim
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Emotional intelligence (EI) in artificial intelligence (AI), which refers to the ability of an AI to understand and respond appropriately to human emotions, has emerged as a crucial research topic. Recent studies have shown that large language models (LLMs) and vision large language models (VLLMs) possess EI and the ability to understand emotional stimuli in the form of text and images, respectively. However, factors influencing the emotion prediction performance of VLLMs in real-world conversational contexts have not been sufficiently explored. This study aims to analyze the key elements affecting the emotion prediction performance of VLLMs in conversational contexts systematically. To achieve this, we reconstructed the MELD dataset, which is based on the popular TV series Friends, and conducted experiments through three sub-tasks: overall emotion tone prediction, character emotion prediction, and contextually appropriate emotion expression selection. We evaluated the performance differences based on various model architectures (e.g., image encoders, modality alignment, and LLMs) and image scopes (e.g., entire scene, person, and facial expression). In addition, we investigated the impact of providing persona information on the emotion prediction performance of the models and analyzed how personality traits and speaking styles influenced the emotion prediction process. We conducted an in-depth analysis of the impact of various other factors, such as gender and regional biases, on the emotion prediction performance of VLLMs. The results revealed that these factors significantly influenced the model performance.

pdf bib abs
Generative Interpretation: Toward Human-Like Evaluation for Educational Question-Answer Pair Generation
Hyeonseok Moon | Jaewook Lee | Sugyeong Eo | Chanjun Park | Jaehyung Seo | Heuiseok Lim
Findings of the Association for Computational Linguistics: EACL 2024

Educational question-answer generation has been extensively researched owing to its practical applicability. However, we have identified a persistent challenge concerning the evaluation of such systems. Existing evaluation methods often fail to produce objective results and instead exhibit a bias towards favoring high similarity to the ground-truth question-answer pairs. In this study, we demonstrate that these evaluation methods yield low human alignment and propose an alternative approach called Generative Interpretation (GI) to achieve more objective evaluations. Through experimental analysis, we reveal that GI outperforms existing evaluation methods in terms of human alignment, and even shows comparable performance with GPT3.5, only with BART-large.

Multiple-choice questions (MCQs) are ubiquitous in almost all levels of education since they are easy to administer, grade, and are a reliable format in assessments and practices. One of the most important aspects of MCQs is the distractors, i.e., incorrect options that are designed to target common errors or misconceptions among real students. To date, the task of crafting high-quality distractors largely remains a labor and time-intensive process for teachers and learning content designers, which has limited scalability. In this work, we study the task of automated distractor generation in the domain of math MCQs and explore a wide variety of large language model (LLM)-based approaches, from in-context learning to fine-tuning. We conduct extensive experiments using a real-world math MCQ dataset and find that although LLMs can generate some mathematically valid distractors, they are less adept at anticipating common errors or misconceptions among real students.

pdf bib abs
KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models
Jaehyung Seo | Jaewook Lee | Chanjun Park | SeongTae Hong | Seungjun Lee | Heuiseok Lim
Findings of the Association for Computational Linguistics: ACL 2024

The evolution of large language models (LLMs) has culminated in a multitask model paradigm where prompts drive the generation of user-specific outputs. However, this advancement has revealed a critical challenge: LLMs frequently produce outputs against socially acceptable commonsense standards in various scenarios. To address this gap in commonsense reasoning, we present KoCommonGEN v2, a fine-grained benchmark dataset focused on Korean commonsense reasoning. This dataset, enriched with human annotations, comprises multiple-choice questions across seven error categories. These categories include commonsense memorization, numerical commonsense, toxic speech, and more, which are vulnerable to undermining the reliability of LLMs’ commonsense reasoning capabilities. The empirical results present that LLMs struggle with Korean commonsense reasoning. With human accuracy benchmarked at approximately 85%, GPT-4’s performance lags at about 74%, and other LLMs demonstrate an average accuracy of around 42%. Our findings emphasize the need for targeted improvements in Korean commonsense reasoning within LLMs, paving the way for more socially and contextually sensitive AI models.

pdf bib abs
Exploring Automated Keyword Mnemonics Generation with Large Language Models via Overgenerate-and-Rank
Jaewook Lee | Hunter McNichols | Andrew Lan
Findings of the Association for Computational Linguistics: EMNLP 2024

In this paper, we study an under-explored area of language and vocabulary learning: keyword mnemonics, a technique for memorizing vocabulary through memorable associations with a target word via a verbal cue. Typically, creating verbal cues requires extensive human effort and is quite time-consuming, necessitating an automated method that is more scalable. We propose a novel overgenerate-and-rank method via prompting large language models (LLMs) to generate verbal cues and then ranking them according to psycholinguistic measures and takeaways from a pilot user study. To assess cue quality, we conduct both an automated evaluation of imageability and coherence, as well as a human evaluation involving English teachers and learners. Results show that LLM-generated mnemonics are comparable to human-generated ones in terms of imageability, coherence, and perceived usefulness, but there remains plenty of room for improvement due to the diversity in background and preference among language learners.

2023

pdf bib abs
A Framework for Vision-Language Warm-up Tasks in Multimodal Dialogue Models
Jaewook Lee | Seongsik Park | Seong-Heum Park | Hongjin Kim | Harksoo Kim
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Most research on multimodal open-domain dialogue agents has focused on pretraining and multi-task learning using additional rich datasets beyond a given target dataset. However, methods for exploiting these additional datasets can be quite limited in real-world settings, creating a need for more efficient methods for constructing agents based solely on the target dataset. To address these issues, we present a new learning strategy called vision-language warm-up tasks for multimodal dialogue models (VLAW-MDM). This strategy does not require the use of large pretraining or multi-task datasets but rather relies solely on learning from target data. Moreover, our proposed approach automatically generate captions for images and incorporate them into the model’s input to improve the contextualization of visual information. Using this novel approach, we empirically demonstrate that our learning strategy is effective for limited data and relatively small models. The result show that our method achieved comparable and in some cases superior performance compared to existing state-of-the-art models on various evaluation metrics.

pdf bib abs
CHEF in the Language Kitchen: A Generative Data Augmentation Leveraging Korean Morpheme Ingredients
Jaehyung Seo | Hyeonseok Moon | Jaewook Lee | Sugyeong Eo | Chanjun Park | Heuiseok Lim
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

Korean morphological variations present unique opportunities and challenges in natural language processing (NLP), necessitating an advanced understanding of morpheme-based sentence construction. The complexity of morphological variations allows for diverse sentence forms based on the syntactic-semantic integration of functional morphemes (i.e., affixes) to lexical morphemes (i.e., roots). With this in mind, we propose a method - CHEF, replicating the morphological transformations inherent in sentences based on lexical and functional morpheme combinations through generative data augmentation. CHEF operates using a morpheme blender and a label discriminator, thereby enhancing the diversity of Korean sentence forms by capturing the properties of agglutination while maintaining label consistency. We conduct experiments on Korean multiple classification datasets, improving model performance in full- and few-shot settings. Our proposed method boosts performance beyond the preceding data augmentation methods without incurring external data usage. We demonstrate that our approach achieves comparable results yielded by augmentation techniques that use large language models (LLMs).