Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models
Kevin Zhou, Adam Dejl, Gabriel Freedman, Lihu Chen, Antonio Rago, Francesca Toni
Abstract
Research in uncertainty quantification (UQ) for large language models (LLMs) is increasingly important towards guaranteeing the reliability of this groundbreaking technology. We explore the integration of LLM UQ methods in argumentative LLMs (ArgLLMs), an explainable LLM framework for decision-making based on computational argumentation in which UQ plays a critical role. We conduct experiments to evaluate ArgLLMs’ performance on claim verification tasks when using different LLM UQ methods, inherently performing an assessment of the UQ methods’ effectiveness. Moreover, the experimental procedure itself is a novel way of evaluating the effectiveness of UQ methods, especially when intricate and potentially contentious statements are present. Our results demonstrate that, despite its simplicity, direct prompting is an effective UQ strategy in ArgLLMs, outperforming considerably more complex approaches.
- Anthology ID: 2025.findings-emnlp.1184
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2025
- Month: November
- Year: 2025
- Address: Suzhou, China
- Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 21700–21711
- URL: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1184/
- DOI: 10.18653/v1/2025.findings-emnlp.1184
- Cite (ACL): Kevin Zhou, Adam Dejl, Gabriel Freedman, Lihu Chen, Antonio Rago, and Francesca Toni. 2025. Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 21700–21711, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models (Zhou et al., Findings 2025)
- PDF: https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1184.pdf
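To make the abstract's notion of "direct prompting" as a UQ strategy concrete, the following is a minimal sketch of eliciting a confidence score for a claim by asking the model directly. The prompt wording, the `direct_prompt_confidence` helper, and the stub model call are all illustrative assumptions, not taken from the paper.

```python
import re
from typing import Callable

def direct_prompt_confidence(claim: str, llm: Callable[[str], str]) -> float:
    """Elicit a confidence score for `claim` via direct prompting.

    `llm` is any callable mapping a prompt string to a model reply;
    the prompt wording below is an illustrative guess, not the paper's.
    """
    prompt = (
        "How confident are you that the following statement is true?\n"
        f"Statement: {claim}\n"
        "Answer with a single number between 0.0 and 1.0."
    )
    reply = llm(prompt)
    # Take the first number in the reply; fall back to 0.5 (maximal uncertainty).
    match = re.search(r"\d*\.?\d+", reply)
    score = float(match.group()) if match else 0.5
    return min(max(score, 0.0), 1.0)

if __name__ == "__main__":
    # Stub model for demonstration; replace with a real LLM call.
    stub = lambda prompt: "0.9"
    print(direct_prompt_confidence("Water boils at 100 °C at sea level.", stub))
```

In an ArgLLM-style pipeline, scores elicited this way would serve as the strengths of supporting and attacking arguments; the point of the sketch is only to show how simple the direct-prompting baseline is compared with more elaborate UQ methods.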