AdaJudge: Adaptive Multi-Perspective Judging for Reward Modeling

Yongliang Miao; Yangyang Liang; Mengnan Du

Abstract

Reward modeling is essential for aligning large language models with human preferences, yet predominant architectures rely on a static pooling strategy to condense sequences into scalar scores. This paradigm suffers from two key limitations: a static inductive bias that is misaligned with task-dependent preference signals, and a representational mismatch, since the backbone is optimized for generation and its representations are ill-suited to fine-grained discrimination. To address these limitations, we propose AdaJudge, a unified framework that jointly adapts representation and aggregation. AdaJudge first refines backbone representations into a discrimination-oriented space via gated refinement blocks. It then replaces the static readout with an adaptive multi-view pooling module that dynamically routes and combines evidence. Extensive experiments on RM-Bench and JudgeBench show that AdaJudge outperforms strong off-the-shelf reward models and traditional pooling baselines.
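
The two stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the gate, the set of views, or the router, so everything below is an assumption made for illustration, including the function names (`gated_refine`, `pooled_views`, `adaptive_score`), the choice of mean, max, and last-token views, and the linear router and scoring head.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gated_refine(h, w_gate, w_update):
    """Assumed refinement: blend each token with a transformed copy via a sigmoid gate."""
    out = []
    for tok in h:
        gate = 1.0 / (1.0 + math.exp(-dot(w_gate, tok)))            # scalar gate per token
        update = [math.tanh(w * x) for w, x in zip(w_update, tok)]  # elementwise transform
        out.append([gate * u + (1.0 - gate) * x for u, x in zip(update, tok)])
    return out

def pooled_views(h):
    """Assumed views: mean pooling, max pooling, and the last token."""
    dim = len(h[0])
    mean = [sum(tok[d] for tok in h) / len(h) for d in range(dim)]
    peak = [max(tok[d] for tok in h) for d in range(dim)]
    return [mean, peak, list(h[-1])]

def adaptive_score(h, router, head):
    """Weight the views with an input-dependent softmax, then score the mixture."""
    views = pooled_views(h)
    query = views[0]                                      # route on the mean-pooled summary
    weights = softmax([dot(r, query) for r in router])    # one weight per view
    mixed = [sum(w * v[d] for w, v in zip(weights, views)) for d in range(len(query))]
    return dot(head, mixed), weights

# Toy hidden states: 3 tokens, 2 dimensions (made-up numbers).
hidden = [[0.2, -0.1], [0.5, 0.3], [-0.4, 0.8]]
refined = gated_refine(hidden, w_gate=[0.1, -0.2], w_update=[1.0, 0.5])
score, weights = adaptive_score(
    refined,
    router=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    head=[0.7, -0.3],
)
```

In this sketch a static readout corresponds to fixing `weights` in advance (for example, all weight on the last token), whereas the adaptive variant recomputes them for every input.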

Anthology ID: 2026.acl-long.440
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 9712–9724
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.440/
Cite (ACL): Yongliang Miao, Yangyang Liang, and Mengnan Du. 2026. AdaJudge: Adaptive Multi-Perspective Judging for Reward Modeling. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9712–9724, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): AdaJudge: Adaptive Multi-Perspective Judging for Reward Modeling (Miao et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.440.pdf
Checklist: 2026.acl-long.440.checklist.pdf
