@inproceedings{liu-etal-2025-assistant,
    title = "Assistant-Guided Mitigation of Teacher Preference Bias in {LLM}-as-a-Judge",
    author = "Liu, Zhuo and
      Li, Moxin and
      Deng, Xun and
      Wang, Qifan and
      Feng, Fuli",
    editor = "Christodoulopoulos, Christos and
      Chakraborty, Tanmoy and
      Rose, Carolyn and
      Peng, Violet",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.findings-emnlp.510/",
    doi = "10.18653/v1/2025.findings-emnlp.510",
    pages = "9610--9631",
    isbn = "979-8-89176-335-7",
    abstract = "LLM-as-a-Judge employs large language models (LLMs), such as GPT-4, to evaluate the quality of LLM-generated responses, gaining popularity for its cost-effectiveness and strong alignment with human evaluations. However, training proxy judge models using evaluation data generated by powerful teacher models introduces a critical yet previously overlooked issue: teacher preference bias, where the proxy judge model learns a biased preference for responses from the teacher model. To tackle this problem, we propose a novel setting that incorporates an additional assistant model, which is not biased toward the teacher model{'}s responses, to complement the training data. Building on this setup, we introduce AGDe-Judge, a three-stage framework designed to debias from both the labels and feedbacks in the training data. Extensive experiments demonstrate that AGDe-Judge effectively reduces teacher preference bias while maintaining strong performance across six evaluation benchmarks."
}
[Assistant-Guided Mitigation of Teacher Preference Bias in LLM-as-a-Judge](https://aclanthology.org/2025.findings-emnlp.510/) (Liu et al., Findings 2025)
ACL