Self-Ensemble: Mitigating Confidence Distortion for Large Language Models
Zicheng Xu | Guanchu Wang | Guangyao Zheng | Yu-Neng Chuang | Alex Szalay | Xia Hu | Vladimir Braverman
Findings of the Association for Computational Linguistics: EMNLP 2025
Although Large Language Models (LLMs) perform well in general domains, they exhibit a **confidence distortion problem** on multi-choice question answering (MCQA), particularly as the number of answer choices increases. Specifically, on MCQA with many choices, LLMs are under-confident in correct predictions and over-confident in incorrect ones, leading to substantially degraded performance. To address this problem, we propose Self-Ensemble. Our method splits the answer choices into several groups and ensembles LLM predictions across these groups to reach a final decision. Self-Ensemble is plug-and-play: it can be integrated into an existing LLM architecture via a designed attention mask and positional encoding, without requiring labeled datasets for parameter tuning. Experimental results on three LLMs and multiple datasets demonstrate that Self-Ensemble comprehensively addresses the confidence distortion problem, outperforming standard inference as well as baseline methods.
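To make the grouping-and-ensembling idea concrete, here is a minimal, hypothetical Python sketch. The `score_choices(question, group)` interface is an assumption introduced for illustration, not an API from the paper; the actual Self-Ensemble fuses the groups inside the model via a designed attention mask and positional encoding rather than issuing separate queries per group.

```python
# Hypothetical sketch of the choice-splitting idea behind Self-Ensemble.
# NOTE: this is NOT the paper's implementation. `score_choices` is an
# assumed interface: given a question and a small group of choices, it
# returns one model confidence per choice in that group.

def split_into_groups(choices, group_size):
    """Partition the answer choices into consecutive groups."""
    return [choices[i:i + group_size]
            for i in range(0, len(choices), group_size)]

def self_ensemble(question, choices, score_choices, group_size=4):
    """Score each small group separately, then return the choice with
    the highest confidence across all groups."""
    best_choice, best_score = None, float("-inf")
    for group in split_into_groups(choices, group_size):
        for choice, score in zip(group, score_choices(question, group)):
            if score > best_score:
                best_choice, best_score = choice, score
    return best_choice
```

Keeping each group small mimics an MCQA instance with few choices, the regime where, per the abstract, confidence distortion is mildest; aggregating across groups then recovers a decision over the full choice set.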