Nakyeong Yang
2026
Persona Switch: Mixing Distinct Perspectives in Decoding Time
Junseok Kim | Nakyeong Yang | Kyomin Jung
Findings of the Association for Computational Linguistics: EACL 2026
Junseok Kim | Nakyeong Yang | Kyomin Jung
Findings of the Association for Computational Linguistics: EACL 2026
Role-play prompting is known to steer the behavior of language models by injecting a persona into the prompt, improving their zero-shot reasoning capabilities. However, such improvements are inconsistent across different tasks or instances. This inconsistency suggests that zero-shot and role-play prompting may offer complementary strengths rather than one being universally superior. Building on this insight, we propose **Persona Switch**, a novel decoding method that dynamically combines the benefits of both prompting strategies. Our method proceeds step-by-step, selecting the better output between zero-shot and role-play prompting at each step by comparing their output confidence, as measured by the logit gap. Experiments with widely-used LLMs demonstrate that Persona Switch consistently outperforms competitive baselines, achieving up to 5.13% accuracy improvement. Furthermore, we show that output confidence serves as an informative measure for selecting the more reliable output.
How Training Data Shapes the Use of Parametric and In-Context Knowledge in Language Models
Minsung Kim | Dong-Kyum Kim | Jea Kwon | Nakyeong Yang | Kyomin Jung | Meeyoung Cha
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Minsung Kim | Dong-Kyum Kim | Jea Kwon | Nakyeong Yang | Kyomin Jung | Meeyoung Cha
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models leverage both parametric knowledge acquired during pretraining and in-context knowledge provided at inference time. Crucially, when these sources conflict, models arbitrate based on their internal confidence, preferring parametric knowledge for high-confidence facts while deferring to context for less familiar ones. However, the training conditions that give rise to these fundamental behaviors remain unclear. Here we conduct controlled experiments using synthetic corpora to identify the specific data properties that shape knowledge utilization. Our results reveal a counterintuitive finding: the robust, balanced use of both knowledge sources is an emergent property that requires the co-occurrence of three factors typically considered detrimental, including (i) intra-document repetition, (ii) a moderate degree of intra-document inconsistency, and (iii) a skewed knowledge distribution. We further show that these dynamics arise in real-world language model pretraining and analyze how post-training procedures reshape arbitration strategies. Together, our findings provide empirical guidance for designing training data that supports the reliable integration of parametric and in-context knowledge in language models.
Rethinking Post-Unlearning Behavior of Large Vision-Language Models
Minsung Kim | Nakyeong Yang | Kyomin Jung
Findings of the Association for Computational Linguistics: ACL 2026
Minsung Kim | Nakyeong Yang | Kyomin Jung
Findings of the Association for Computational Linguistics: ACL 2026
Large Vision-Language Models (LVLMs) can recognize individuals in images and disclose sensitive personal information about them, raising critical privacy concerns. Machine unlearning aims to remove such knowledge from the model. However, existing methods rarely prescribe what the model should output in place of the forgotten content, leading to Unlearning Aftermaths: degenerate, hallucinated, or excessively refused responses. We argue that, especially for generative LVLMs, it is crucial to consider the quality and informativeness of post-unlearning responses rather than relying solely on naive suppression. To address this, we introduce a new unlearning task for LVLMs that requires models to provide privacy-preserving yet informative and visually grounded responses. We also propose PUBG, a novel unlearning method that explicitly guides post-unlearning behavior toward a desirable output distribution. Experiments show that, while existing methods suffer from Unlearning Aftermaths despite successfully preventing privacy violations, PUBG effectively mitigates these issues, generating visually grounded and informative responses without privacy leakage for forgotten targets.
Reliability-Aware Adaptive Self-Consistency for Efficient Sampling in LLM Reasoning
Junseok Kim | Nakyeong Yang | Kyungmin Min | Kyomin Jung
Findings of the Association for Computational Linguistics: ACL 2026
Junseok Kim | Nakyeong Yang | Kyungmin Min | Kyomin Jung
Findings of the Association for Computational Linguistics: ACL 2026
Self-Consistency improves reasoning reliability through multi-sample aggregation, but incurs substantial inference cost. Adaptive self-consistency methods mitigate this issue by adjusting the sampling budget; however, they rely on count-based stopping rules that treat all responses equally, often leading to unnecessary sampling. We propose Reliability-Aware Adaptive Self-Consistency (), which addresses this limitation by reframing adaptive sampling from response counting to evidence sufficiency, leveraging response-level confidence for principled information aggregation. operates in two stages: a single-sample decision stage that resolves instances confidently answerable from a single response, and a reliability-aware accumulation stage that aggregates responses by jointly leveraging their frequency and confidence. Across five models and four datasets, consistently achieves the best accuracy-cost trade-off compared to existing baselines, yielding improved inference efficiency across model scales from 3B to 27B parameters. As a concrete example, reduces inference cost by up to 70% relative to self-consistency while preserving accuracy on GSM8K using Gemma-3-4B-it.
2025
FaithUn: Toward Faithful Forgetting in Language Models by Investigating the Interconnectedness of Knowledge
Nakyeong Yang | Minsung Kim | Seunghyun Yoon | Joongbo Shin | Kyomin Jung
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Nakyeong Yang | Minsung Kim | Seunghyun Yoon | Joongbo Shin | Kyomin Jung
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Various studies have attempted to remove sensitive or private knowledge from a language model to prevent its unauthorized exposure. However, prior studies have overlooked the inherent complexity and interconnectedness of knowledge, which requires careful examination. To resolve this problem, we first define a new concept called superficial unlearning, which refers to the phenomenon where an unlearning method either fails to erase the interconnected knowledge it should remove or unintentionally erases irrelevant knowledge. Based on the definition, we introduce a novel benchmark, FaithUn, to analyze and evaluate the faithfulness of unlearning in real-world knowledge QA settings. Furthermore, we propose a novel unlearning method, KLUE, which updates only knowledge-related neurons to achieve faithful unlearning. KLUE leverages a regularized explainability method to localize contextual knowledge neurons, updating only these neurons using carefully selected unforgotten samples. Experimental results demonstrate that existing unlearning methods fail to ensure faithful unlearning, while our method shows significant effectiveness in real-world QA unlearning.
Avoidance Decoding for Diverse Multi-Branch Story Generation
Kyeongman Park | Nakyeong Yang | Kyomin Jung
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Kyeongman Park | Nakyeong Yang | Kyomin Jung
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large Language Models (LLMs) often generate repetitive and monotonous outputs, especially in tasks like story generation, due to limited creative diversity when given the same input prompt. To address this challenge, we propose a novel decoding strategy, ***Avoidance Decoding***, that modifies token logits by penalizing similarity to previously generated outputs, thereby encouraging more diverse multi-branch stories. This penalty adaptively balances two similarity measures: (1) Concept-level Similarity Penalty, which is prioritized in early stages to diversify initial story concepts, and (2) Narrative-level Similarity Penalty, which is increasingly emphasized later to ensure natural yet diverse plot development. Notably, our method achieves up to **2.6** times higher output diversity and reduces repetition by an average of 30% compared to strong baselines, while effectively mitigating text degeneration. Furthermore, we reveal that our method activates a broader range of neurons, demonstrating that it leverages the model’s intrinsic creative capacity.
Persona is a Double-Edged Sword: Rethinking the Impact of Role-play Prompts in Zero-shot Reasoning Tasks
Junseok Kim | Nakyeong Yang | Kyomin Jung
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Junseok Kim | Nakyeong Yang | Kyomin Jung
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Recent studies have shown that prompting large language models (LLMs) with role-playing personas can enhance their reasoning capabilities. While the benefits of role-playing personas in reasoning tasks are widely recognized, it remains uncertain whether a persona aligned with the given dataset can consistently achieve these improvements. In this work, we empirically investigate the potential drawbacks of using dataset-aligned personas (referred to as **coarsely aligned personas**) and introduce Jekyll & Hyde, a novel framework that enhances reasoning robustness by ensembling solutions from both role-playing and neutral (non-persona) prompts.Jekyll & Hyde first predicts an instance-specific persona tailored to each query using an LLM, then generates answers with both persona and neutral prompts, and finally selects the superior output through an LLM-based evaluator.Experimental results claim that across twelve widely used natural language reasoning datasets and three backbone large language models, Jekyll & Hyde consistently outperforms single-perspective LLMs, achieving an average accuracy gain of **9.98%** on GPT‐4.We further demonstrate that using instance‐aligned personas yields more accurate and stable performance than using dataset-aligned personas.
2024
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination
Nakyeong Yang | Taegwan Kang | Stanley Jungkyu Choi | Honglak Lee | Kyomin Jung
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Nakyeong Yang | Taegwan Kang | Stanley Jungkyu Choi | Honglak Lee | Kyomin Jung
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Instruction-following language models often show undesirable biases. These undesirable biases may be accelerated in the real-world usage of language models, where a wide range of instructions is used through zero-shot example prompting. To solve this problem, we first define the bias neuron, which significantly affects biased outputs, and prove its existence empirically. Furthermore, we propose a novel and practical bias mitigation method, CRISPR, to eliminate bias neurons of language models in instruction-following settings. CRISPR automatically determines biased outputs and categorizes neurons that affect the biased outputs as bias neurons using an explainability method. Experimental results demonstrate the effectiveness of our method in mitigating biases under zero-shot instruction-following settings without losing the model’s task performance and existing knowledge. The experimental results reveal the generalizability of our method as it shows robustness under various instructions and datasets. Surprisingly, our method can mitigate the bias in language models by eliminating only a few neurons (at least three).
2023
Task-specific Compression for Multi-task Language Models using Attribution-based Pruning
Nakyeong Yang | Yunah Jang | Hwanhee Lee | Seohyeong Jeong | Kyomin Jung
Findings of the Association for Computational Linguistics: EACL 2023
Nakyeong Yang | Yunah Jang | Hwanhee Lee | Seohyeong Jeong | Kyomin Jung
Findings of the Association for Computational Linguistics: EACL 2023
Multi-task language models show outstanding performance for various natural language understanding tasks with only a single model. However, these language models inevitably utilize an unnecessarily large number of model parameters, even when used only for a specific task. In this paper, we propose a novel training-free compression method for multi-task language models using pruning method. Specifically, we use an attribution method to determine which neurons are essential for performing a specific task. We task-specifically prune unimportant neurons and leave only task-specific parameters. Furthermore, we extend our method to be applicable in both low-resource and unsupervised settings. Since our compression method is training-free, it uses little computing resources and does not update the pre-trained parameters of language models, reducing storage space usage. Experimental results on the six widely-used datasets show that our proposed pruning method significantly outperforms baseline pruning methods. In addition, we demonstrate that our method preserves performance even in an unseen domain setting.
Multi-View Zero-Shot Open Intent Induction from Dialogues: Multi Domain Batch and Proxy Gradient Transfer
Hyukhun Koh | Haesung Pyun | Nakyeong Yang | Kyomin Jung
Proceedings of the Eleventh Dialog System Technology Challenge
Hyukhun Koh | Haesung Pyun | Nakyeong Yang | Kyomin Jung
Proceedings of the Eleventh Dialog System Technology Challenge
In Task Oriented Dialogue (TOD) system, detecting and inducing new intents are two main challenges to apply the system in the real world. In this paper, we suggest the semantic multiview model to resolve these two challenges: (1) SBERT for General Embedding (GE), (2) Multi Domain Batch (MDB) for dialogue domain knowledge, and (3) Proxy Gradient Transfer (PGT) for cluster-specialized semantic. MDB feeds diverse dialogue datasets to the model at once to tackle the multi-domain problem by learning the multiple domain knowledge. We introduce a novel method PGT, which employs the Siamese network to fine-tune the model with a clustering method directly. Our model can learn how to cluster dialogue utterances by using PGT. Experimental results demonstrate that our multi-view model with MDB and PGT significantly improves the Open Intent Induction performance compared to baseline systems.