Hakyung Lee
2025
Uncertainty-Aware Contrastive Decoding
Hakyung Lee | Subeen Park | Joowang Kim | Sungjun Lim | Kyungwoo Song
Findings of the Association for Computational Linguistics: ACL 2025
Large language models excel in a wide range of natural language processing tasks, but generating factually accurate and consistent outputs remains a challenge. To improve text reliability, Contrastive Decoding (CD) refines token selection by leveraging differences between an expert and base model, penalizing low-quality token choices. However, CD employs static weighting between models, making it sensitive to variations in model architecture and input characteristics, often resulting in suboptimal token selection and error propagation throughout generation. We propose Uncertainty-Aware Contrastive Decoding (UCD), a method that dynamically adjusts model contributions at each decoding step based on uncertainty. We introduce a cumulative energy function, where uncertainty is quantified as the negative log-sum-exp over logits, and decomposed into entropy and expected logit components. This energy serves as a dynamic confidence signal, guiding adaptive model weighting during generation. We demonstrate through extensive experiments that UCD significantly improves factual accuracy and reliability over existing decoding methods. Finally, we provide a theoretical analysis showing that our energy function serves as a well-defined uncertainty metric capturing model confidence. Our code is available at: https://github.com/MLAI-Yonsei/UCD.
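The abstract above describes uncertainty as the negative log-sum-exp over logits, decomposed into entropy and expected-logit components. The sketch below is a minimal illustration of that identity for a single decoding step, not the authors' implementation: the function names and the example logits are hypothetical, and it does not show how UCD accumulates the energy across steps or converts it into model weights, since the abstract does not specify either.

```python
import math

def energy(logits):
    """Negative log-sum-exp of the logits, computed stably."""
    m = max(logits)
    return -(m + math.log(sum(math.exp(z - m) for z in logits)))

def entropy_and_expected_logit(logits):
    """Softmax entropy H(p) and expected logit E_p[z] for one step."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    expected_logit = sum(p * z for p, z in zip(probs, logits))
    return entropy, expected_logit

# Hypothetical logits for one decoding step (illustrative values only).
logits = [2.0, 0.5, -1.0]
h, ez = entropy_and_expected_logit(logits)

# Identity: -logsumexp(z) = -(H(p) + E_p[z]), where p = softmax(z).
assert math.isclose(energy(logits), -(h + ez))
```

The identity follows from log p_i = z_i - logsumexp(z): taking the expectation under p gives -H(p) = E_p[z] - logsumexp(z).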
2024
CED: Comparing Embedding Differences for Detecting Out-of-Distribution and Hallucinated Text
Hakyung Lee | Keon-Hee Park | Hoyoon Byun | Jeyoon Yeom | Jihee Kim | Gyeong-Moon Park | Kyungwoo Song
Findings of the Association for Computational Linguistics: EMNLP 2024
Detecting out-of-distribution (OOD) samples is crucial for ensuring the safety and robustness of models deployed in real-world scenarios. While most studies on OOD detection focus on fine-tuned models trained on in-distribution (ID) data, detecting OOD in pre-trained models is also important due to computational limitations and the widespread use of open-source pre-trained models. However, in the same domain shift setting, the OOD detection performance of pre-trained models is insufficient because both ID and OOD samples originate from the same domain, leading to a high overlap in their embeddings. To address this issue, we introduce a new method called CED, a training-free OOD detection technique designed to enhance the distinction between ID and OOD datasets. We theoretically validate that specific auxiliary and oracle samples that satisfy certain conditions improve this distinction. Motivated by our theoretical analysis, CED enhances the differentiation by utilizing these specially designed auxiliary and oracle samples. As a result, CED significantly improves the ability of pre-trained models to distinguish between ID and OOD samples in text classification and hallucination detection tasks. Furthermore, we verify that CED is a plug-and-play method compatible with various backbone networks, such as RoBERTa, Llama, and OpenAI Embedding.