Graders Should Cheat: Privileged Information Enables Expert-Level Automated Evaluations

Jin Peng Zhou, Séb Arnold, Nan Ding, Kilian Q Weinberger, Nan Hua, Fei Sha


Abstract
Auto-evaluating language models (LMs), *i.e*., using a grader LM to evaluate a candidate LM, is an appealing way to accelerate evaluation and reduce its cost. But this presents a paradox: how can we trust the grader LM, which is presumably no stronger than the candidate LM, to assess problems that lie beyond the frontier of either model’s capabilities? For instance, today’s LMs struggle with graduate-level physics and Olympiad-level math, making them unreliable graders in these domains. We show that providing *privileged information* – such as ground-truth solutions or problem-specific guidelines – improves automated evaluation on such frontier problems. This approach offers two key advantages. First, it expands the range of problems to which LM graders apply: weaker models can now rate the predictions of stronger models. Second, privileged information can be used to devise easier variations of challenging problems, which improves the separability of different LMs on tasks where their performance is generally low. With this approach, general-purpose LM graders match state-of-the-art performance on *RewardBench*, surpassing almost all specially-tuned models. LM graders also outperform individual human raters on *Vibe-Eval*, and approach human expert graders on Olympiad-level math problems.
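To make the idea concrete, the grading setup described above can be sketched as a prompt-construction step in which the grader optionally receives material the candidate never saw. This is a minimal illustrative sketch, not the authors' implementation; the function name, prompt wording, and field labels are assumptions for illustration only.

```python
from typing import Optional


def build_grader_prompt(question: str,
                        candidate_answer: str,
                        reference_solution: Optional[str] = None,
                        guidelines: Optional[str] = None) -> str:
    """Assemble a grading prompt; the privileged fields are optional.

    Hypothetical sketch: when `reference_solution` or `guidelines` is
    supplied, the grader LM "cheats" by seeing privileged information
    (a ground-truth solution or a problem-specific rubric) that the
    candidate LM did not have access to.
    """
    parts = [
        "You are grading a model's answer. "
        "Reply with VERDICT: CORRECT or VERDICT: INCORRECT.",
        f"Question:\n{question}",
        f"Candidate answer:\n{candidate_answer}",
    ]
    if reference_solution is not None:
        parts.append(f"Ground-truth solution (privileged):\n{reference_solution}")
    if guidelines is not None:
        parts.append(f"Problem-specific guidelines (privileged):\n{guidelines}")
    return "\n\n".join(parts)
```

The resulting string would then be sent to whatever grader LM is available; without the privileged fields the function degenerates to a plain LM-as-judge prompt, which is exactly the baseline the abstract contrasts against.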
Anthology ID:
2025.emnlp-main.838
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
16583–16601
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.838/
Cite (ACL):
Jin Peng Zhou, Séb Arnold, Nan Ding, Kilian Q Weinberger, Nan Hua, and Fei Sha. 2025. Graders Should Cheat: Privileged Information Enables Expert-Level Automated Evaluations. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 16583–16601, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Graders Should Cheat: Privileged Information Enables Expert-Level Automated Evaluations (Zhou et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.838.pdf
Checklist:
 2025.emnlp-main.838.checklist.pdf