Self-Ensemble: Mitigating Confidence Distortion for Large Language Models

Zicheng Xu, Guanchu Wang, Guangyao Zheng, Yu-Neng Chuang, Alex Szalay, Xia Hu, Vladimir Braverman


Abstract
Although Large Language Models (LLMs) perform well in general fields, they exhibit a confidence distortion problem on multi-choice question-answering (MCQA), particularly as the number of answer choices increases. Specifically, on MCQA with many choices, LLMs suffer from under-confidence in correct predictions and over-confidence in incorrect ones, leading to substantially degraded performance. To solve this problem, we propose Self-Ensemble in this work. Our method splits the choices into several groups and ensembles LLM predictions across these groups to reach a final decision. The advantage of Self-Ensemble is its plug-and-play nature: it can be integrated into existing LLM architectures through a designed attention mask and positional encoding, without requiring labeled datasets for parameter tuning. Experimental results on three LLMs and datasets demonstrate that Self-Ensemble comprehensively addresses the confidence distortion problem of LLMs, outperforming standard inference as well as baseline methods.
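The group-split-and-ensemble idea can be illustrated with a minimal sketch. Note the assumptions: the paper integrates the method into the model's forward pass via a designed attention mask and positional encoding, whereas here `choice_scores` (a hypothetical map from choice label to a raw model logit) stands in for the model, and the group-wise softmax followed by a max over per-group confidences is only one plausible reading of the ensembling step, not the authors' exact procedure.

```python
import math

def self_ensemble(choice_scores, group_size=2):
    """Simplified sketch of group-wise ensembling for MCQA.

    choice_scores: hypothetical dict mapping choice label -> raw logit.
    Splits the choices into groups of `group_size`, normalizes scores
    within each group (softmax), then picks the choice with the highest
    within-group confidence across all groups.
    """
    labels = list(choice_scores)
    groups = [labels[i:i + group_size] for i in range(0, len(labels), group_size)]
    confidence = {}
    for group in groups:
        exps = [math.exp(choice_scores[c]) for c in group]
        total = sum(exps)
        for c, e in zip(group, exps):
            confidence[c] = e / total  # probability within this small group
    best = max(confidence, key=confidence.get)
    return best, confidence
```

With fewer choices per forward decision, each within-group probability is spread over a smaller candidate set, which is one intuition for why grouping can counteract the under-confidence that appears as the number of choices grows.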
Anthology ID:
2025.findings-emnlp.902
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
16603–16615
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.902/
DOI:
10.18653/v1/2025.findings-emnlp.902
Cite (ACL):
Zicheng Xu, Guanchu Wang, Guangyao Zheng, Yu-Neng Chuang, Alex Szalay, Xia Hu, and Vladimir Braverman. 2025. Self-Ensemble: Mitigating Confidence Distortion for Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 16603–16615, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Self-Ensemble: Mitigating Confidence Distortion for Large Language Models (Xu et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.902.pdf
Checklist:
2025.findings-emnlp.902.checklist.pdf