DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics

Luke Yoffe, Alfonso Amayuelas, William Yang Wang


Abstract
Multi-agent debate has been introduced to improve the accuracy of Large Language Models (LLMs) by having multiple agents discuss solutions to a problem over several rounds. However, models often generate incorrect yet confident-sounding responses that can mislead other agents. This issue arises in part because agents do not consider how confident their peers are. To address it, we propose DebUnc, a debate framework that uses uncertainty metrics to assess agent confidence. Confidence is then conveyed either through textual prompts or via a modified attention mechanism that adjusts token weights. Evaluations across benchmarks show that attention-based methods are particularly effective, and that performance continues to improve as uncertainty estimation becomes more reliable. The code is available at https://github.com/lukeyoffe/debunc.
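The authors' implementation lives in the repository linked above; as a rough illustration of the idea, the sketch below shows one plausible way to compute a mean-token-entropy confidence score from generation logits and to convey it via the textual-prompt variant. The function names (mean_token_entropy, confidence_prompt) and the entropy-to-confidence mapping are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def mean_token_entropy(logits: torch.Tensor) -> float:
    # Average entropy of the next-token distribution over the generated
    # sequence; lower entropy is read as higher confidence.
    # logits: (seq_len, vocab_size), one row per generated token position.
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # shape: (seq_len,)
    return entropy.mean().item()

def confidence_prompt(peer_responses):
    # Textual-prompt variant: label each peer's response with a confidence
    # score so agents can weigh it in the next debate round.
    # peer_responses: list of (response_text, mean_token_entropy) pairs.
    lines = []
    for i, (response, ent) in enumerate(peer_responses, start=1):
        confidence = 1.0 / (1.0 + ent)  # illustrative entropy-to-score mapping
        lines.append(f"Agent {i} (confidence {confidence:.2f}): {response}")
    return "\n".join(lines)

The attention-based variant would instead rescale the attention weights assigned to each peer's tokens in proportion to that peer's confidence when the next response is generated; the exact uncertainty metrics and scaling rule are specified in the paper and repository.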
Anthology ID:
2025.findings-emnlp.1265
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
23299–23315
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1265/
DOI:
10.18653/v1/2025.findings-emnlp.1265
Cite (ACL):
Luke Yoffe, Alfonso Amayuelas, and William Yang Wang. 2025. DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 23299–23315, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
DebUnc: Improving Large Language Model Agent Communication With Uncertainty Metrics (Yoffe et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1265.pdf
Checklist:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1265.checklist.pdf