Improving LLM-as-a-Judge Inference with the Judgment Distribution

Victor Wang, Michael JQ Zhang, Eunsol Choi


Abstract
Using language models to scalably approximate human preferences on text quality (LLM-as-a-judge) has become a standard practice applicable to many tasks. A judgment is often extracted from the judge’s textual output alone, typically with greedy decoding. However, LLM judges naturally provide distributions over judgment tokens, inviting a breadth of inference methods for extracting fine-grained preferences. We find that taking the mean of the judgment distribution consistently outperforms taking the mode (i.e. greedy decoding) in all evaluation settings (i.e. pointwise, pairwise, and listwise). We further explore novel methods of deriving preferences from judgment distributions, and find that methods incorporating risk aversion often improve performance. Lastly, we analyze LLM-as-a-judge paired with chain-of-thought (CoT) prompting, showing that CoT can collapse the spread of the judgment distribution, often harming performance. Our findings show that leveraging distributional output improves LLM-as-a-judge, as opposed to using the text interface alone.
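To illustrate the abstract's contrast between the mode and the mean of the judgment distribution, here is a minimal sketch. It assumes a pointwise judge scoring on a 1 to 5 scale and assumes the probability of each score token has already been read from the judge's output logits. The function names and the example probabilities are hypothetical and are not taken from the paper.

```python
def normalize(dist):
    """Renormalize probabilities over the judgment tokens so they sum to 1."""
    total = sum(dist.values())
    return {score: p / total for score, p in dist.items()}


def mode_score(dist):
    """Score with the highest probability (what greedy decoding would emit)."""
    return max(dist, key=dist.get)


def mean_score(dist):
    """Expected score under the (renormalized) judgment distribution."""
    dist = normalize(dist)
    return sum(score * p for score, p in dist.items())


# Hypothetical judgment distributions for two responses (illustrative only).
response_a = {1: 0.0, 2: 0.1, 3: 0.5, 4: 0.3, 5: 0.1}
response_b = {1: 0.1, 2: 0.3, 3: 0.5, 4: 0.1, 5: 0.0}

# Both responses share the same mode, so greedy decoding cannot separate them,
# while the mean still yields a preference for response_a.
print(mode_score(response_a), mode_score(response_b))
print(mean_score(response_a), mean_score(response_b))
```

In this toy case the mode ties the two responses, whereas the mean retains the finer-grained preference carried by the rest of the distribution.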
Anthology ID:
2025.findings-emnlp.1259
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
23173–23199
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1259/
DOI:
10.18653/v1/2025.findings-emnlp.1259
Cite (ACL):
Victor Wang, Michael JQ Zhang, and Eunsol Choi. 2025. Improving LLM-as-a-Judge Inference with the Judgment Distribution. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 23173–23199, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Improving LLM-as-a-Judge Inference with the Judgment Distribution (Wang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.findings-emnlp.1259.pdf
Checklist:
 2025.findings-emnlp.1259.checklist.pdf