Choonghyun Park


2024

Aligning Language Models to Explicitly Handle Ambiguity
Hyuhng Joon Kim | Youna Kim | Cheonbok Park | Junyeob Kim | Choonghyun Park | Kang Min Yoo | Sang-goo Lee | Taeuk Kim
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

In interactions between users and language model agents, user utterances frequently exhibit ellipsis (omission of words or phrases) or imprecision (lack of exactness) to prioritize efficiency. This can lead to varying interpretations of the same input based on different assumptions or background knowledge. It is thus crucial for agents to adeptly handle the inherent ambiguity in queries to ensure reliability. However, even state-of-the-art large language models (LLMs) still face challenges in such scenarios, primarily due to the following hurdles: (1) LLMs are not explicitly trained to deal with ambiguous utterances; (2) the degree of ambiguity perceived by an LLM may vary depending on the knowledge it possesses. To address these issues, we propose Alignment with Perceived Ambiguity (APA), a novel pipeline that aligns LLMs to manage ambiguous queries by leveraging their own assessment of ambiguity (i.e., perceived ambiguity). Experimental results on question-answering datasets demonstrate that APA empowers LLMs to explicitly detect and manage ambiguous queries while retaining the ability to answer clear questions. Furthermore, our findings show that APA outperforms training with gold-standard labels, especially in out-of-distribution scenarios. The data and code are available at https://github.com/heyjoonkim/APA.

Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts
Youna Kim | Hyuhng Joon Kim | Cheonbok Park | Choonghyun Park | Hyunsoo Cho | Junyeob Kim | Kang Min Yoo | Sang-goo Lee | Taeuk Kim
Findings of the Association for Computational Linguistics: EMNLP 2024

When using large language models (LLMs) in knowledge-intensive tasks, such as open-domain question answering, external context can bridge the gap between external knowledge and the LLMs’ parametric knowledge. Recent research has developed contrastive decoding approaches that amplify contextual knowledge over the parametric knowledge of LLMs. While these approaches can yield truthful responses when relevant context is provided, they are vulnerable when faced with noisy contexts. We extend the scope of previous studies to encompass noisy contexts and propose adaptive contrastive decoding (ACD) to leverage contextual influence effectively. ACD demonstrates improvements in open-domain question answering tasks compared to baselines, especially in robustness, by remaining undistracted by noisy contexts in retrieval-augmented generation.
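
As a rough illustration of the contrastive decoding idea mentioned in this abstract, the sketch below combines next-token logits computed with and without retrieved context using a fixed weight. It is not the ACD method itself: the abstract does not say how ACD adapts the contextual influence, so the function name `contrastive_logits`, the weight `alpha`, and the toy logits are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_logits(with_context, without_context, alpha):
    """Amplify what the context adds over the model's parametric prediction.

    alpha is a hypothetical fixed weight; alpha = 0 returns the
    context-conditioned logits unchanged.
    """
    return [(1 + alpha) * c - alpha * p
            for c, p in zip(with_context, without_context)]


# Toy next-token logits over a 3-token vocabulary (made-up numbers).
with_context = [2.0, 1.0, 0.5]      # model sees question + retrieved passage
without_context = [1.0, 2.0, 0.5]   # model sees the question only

adjusted = contrastive_logits(with_context, without_context, alpha=1.0)
probs = softmax(adjusted)
```

With a fixed `alpha`, a noisy passage would be amplified just as strongly as a relevant one, which is the weakness the abstract attributes to earlier approaches; an adaptive scheme would presumably shrink the weight in that case.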

2023

Universal Domain Adaptation for Robust Handling of Distributional Shifts in NLP
Hyuhng Kim | Hyunsoo Cho | Sang-Woo Lee | Junyeob Kim | Choonghyun Park | Sang-goo Lee | Kang Yoo | Taeuk Kim
Findings of the Association for Computational Linguistics: EMNLP 2023

When deploying machine learning systems in the wild, it is highly desirable for them to effectively transfer prior knowledge to unfamiliar domains while also raising alarms on anomalous inputs. To address these requirements, Universal Domain Adaptation (UniDA) has emerged as a novel research area in computer vision, focusing on achieving both adaptation ability and robustness (i.e., the ability to detect out-of-distribution samples). While UniDA has led to significant progress in computer vision, its application to language input remains to be explored despite its feasibility. In this paper, we propose a comprehensive benchmark for natural language that offers a thorough view of the model’s generalizability and robustness. Our benchmark encompasses multiple datasets with varying difficulty levels and characteristics, including temporal shifts and diverse domains. On top of our testbed, we validate existing UniDA methods from computer vision and state-of-the-art domain adaptation techniques from the NLP literature, yielding valuable findings: we observe that UniDA methods originally designed for image input can be effectively transferred to the natural language domain, and we underscore the effect of adaptation difficulty in determining the model’s performance.

Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning
Hyunsoo Cho | Choonghyun Park | Junyeob Kim | Hyuhng Joon Kim | Kang Min Yoo | Sang-goo Lee
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)

As the size of pre-trained language models (PLMs) continues to increase, numerous parameter-efficient transfer learning (PETL) methods have been proposed recently to compensate for the high cost of fine-tuning. While large PLMs and various PETL methods have achieved impressive results on various benchmarks, it is uncertain whether they can effectively handle inputs that have been distributionally shifted. In this study, we systematically explore how the ability to detect out-of-distribution (OOD) inputs changes as the size of the PLM grows or the transfer method is altered. Specifically, we evaluate various transfer techniques, including fine-tuning, Adapter, LoRA, and prefix-tuning, on language models of different scales.

2022

Enhancing Out-of-Distribution Detection in Natural Language Understanding via Implicit Layer Ensemble
Hyunsoo Cho | Choonghyun Park | Jaewook Kang | Kang Min Yoo | Taeuk Kim | Sang-goo Lee
Findings of the Association for Computational Linguistics: EMNLP 2022

Out-of-distribution (OOD) detection aims to discern outliers from the intended data distribution, which is crucial to maintaining high reliability and a good user experience. Most recent studies in OOD detection utilize the information from a single representation that resides in the penultimate layer to determine whether the input is anomalous or not. Although such a method is straightforward, the potential of diverse information in the intermediate layers is overlooked. In this paper, we propose a novel framework based on contrastive learning that encourages intermediate features to learn layer-specialized representations and assembles them implicitly into a single representation to absorb rich information in the pre-trained language model. Extensive experiments in various intent classification and OOD datasets demonstrate that our approach is significantly more effective than other works.
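
To make the single-representation baseline described in this abstract concrete, here is a minimal sketch of a distance-based OOD score computed from one feature vector per input. It does not implement the proposed layer ensemble: the nearest-centroid Euclidean score, the helper names `centroid` and `ood_score`, the intent labels, and the 2-dimensional toy features are all assumptions standing in for penultimate-layer representations.

```python
import math


def centroid(vectors):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def ood_score(feature, centroids):
    """Distance to the nearest class centroid; larger means more anomalous."""
    return min(math.dist(feature, c) for c in centroids)


# Toy features per in-distribution intent class (hypothetical values).
train = {
    "book_flight": [[1.0, 0.0], [0.8, 0.2]],
    "play_music": [[0.0, 1.0], [0.2, 0.8]],
}
centroids = [centroid(vectors) for vectors in train.values()]

in_dist = ood_score([0.9, 0.1], centroids)   # close to the book_flight centroid
out_dist = ood_score([5.0, 5.0], centroids)  # far from every centroid
```

An input would be flagged as OOD when its score exceeds a threshold chosen on held-out data. Because the score relies on a single feature vector, it ignores whatever the intermediate layers encode, which is the limitation the abstract points out.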