SELENE: Selective and Evidence-Weighted LLM Debating for Efficient and Reliable Reasoning

Akshay Verma; Swapnil Gupta; Deepak Gupta; Prateek Sircar; Siddharth Pillai

Abstract

Multi-Agent Debate (MAD) frameworks improve factual reliability in large language models (LLMs) by allowing agents to critique and refine one another’s reasoning. Yet existing MAD systems are computationally expensive and prone to degradation under prolonged debates due to redundant exchanges and unstable judging. We propose a lightweight, industry-deployable alternative that unifies Selective Debate Initiation (SDI) with Evidence-Weighted Self-Consistency (EWSC) for adaptive, debate-on-demand reasoning. SDI dynamically predicts when debate is necessary by detecting confidence-likelihood misalignment and semantic disagreement, skipping well-aligned queries to conserve computation. EWSC replaces a single-judge verdict with a variance-aware, evidence-weighted aggregation across paraphrased evaluations, yielding more stable factual judgments. Combined, SDI and EWSC reduce token consumption by nearly 50% while improving both accuracy and calibration. Evaluated on BoolQ, CosmosQA, and an internal QnA benchmark, our framework achieves higher factual robustness and efficiency, demonstrating that scalable, epistemically reliable multi-agent reasoning is practical for real-world LLM deployments.
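
The two components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the thresholds, the `JudgeVote` structure, and the use of a weighted mean with a weighted-variance stability check are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class JudgeVote:
    """One paraphrased evaluation (hypothetical structure)."""
    score: float            # judged probability that the answer is correct, in [0, 1]
    evidence_weight: float  # non-negative weight for how well evidence supports the vote


def should_debate(confidence: float, likelihood: float, disagreement: float,
                  gap_threshold: float = 0.2,
                  disagreement_threshold: float = 0.3) -> bool:
    """SDI-style gate (assumed form): debate only when signals look misaligned.

    Triggers when stated confidence and model likelihood diverge, or when
    the agents' answers disagree semantically. Thresholds are illustrative.
    """
    misaligned = abs(confidence - likelihood) > gap_threshold
    return misaligned or disagreement > disagreement_threshold


def aggregate_votes(votes: list[JudgeVote],
                    max_variance: float = 0.05) -> tuple[bool, float, bool]:
    """EWSC-style aggregation (assumed form).

    Returns (verdict, evidence-weighted mean score, stable flag), where the
    stable flag is False if the weighted variance exceeds max_variance.
    """
    total = sum(v.evidence_weight for v in votes)
    if total <= 0:
        raise ValueError("at least one vote needs a positive evidence weight")
    mean = sum(v.evidence_weight * v.score for v in votes) / total
    variance = sum(v.evidence_weight * (v.score - mean) ** 2 for v in votes) / total
    return mean >= 0.5, mean, variance <= max_variance
```

In this sketch a query whose confidence and likelihood agree and whose agents give similar answers skips the debate entirely, which is where the token savings would come from; the variance check stands in for the abstract's "variance-aware" aggregation by flagging verdicts on which the paraphrased evaluations disagree.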

Anthology ID: 2026.eacl-industry.7
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Yevgen Matusevych, Gülşen Eryiğit, Nikolaos Aletras
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 95–104
URL: https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.7/
Cite (ACL): Akshay Verma, Swapnil Gupta, Deepak Gupta, Prateek Sircar, and Siddharth Pillai. 2026. SELENE: Selective and Evidence-Weighted LLM Debating for Efficient and Reliable Reasoning. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track), pages 95–104, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): SELENE: Selective and Evidence-Weighted LLM Debating for Efficient and Reliable Reasoning (Verma et al., EACL 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.eacl-industry.7.pdf
