Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge

Xin Sun; Di Wu; Sijing Qin; Isao Echizen; Abdallah El Ali; Saku Sugawara

Abstract

Large language models (LLMs) are increasingly used as automated evaluators (LLM-as-a-Judge). This work challenges the reliability of that practice by showing that LLM trust judgments are biased by disclosed source labels. Using a counterfactual design, we find that both humans and LLM judges assign higher trust to information labeled as human-authored than to the same content labeled as AI-generated. Eye-tracking data reveal that humans rely heavily on source labels as heuristic cues for judgments. We also analyze LLM internal states during judgment. Across label conditions, models allocate denser attention to the label region than to the content region, and this label dominance is stronger under Human labels than under AI labels, consistent with the human gaze patterns. In addition, decision uncertainty measured by logits is higher under AI labels than under Human labels. These results indicate that the source label is a salient heuristic cue for both humans and LLMs. This raises validity concerns for label-sensitive LLM-as-a-Judge evaluation, and we cautiously suggest that aligning models with human preferences may propagate human heuristic reliance into models, motivating debiased evaluation and alignment.
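
The two internal-state measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything here is an assumption: attention density is taken as mean attention weight per token in a region, label dominance as the ratio of label density to content density, and decision uncertainty as the Shannon entropy of the softmax over the answer-option logits. The function names and toy numbers are hypothetical and are not taken from the paper.

```python
import math

def attention_density(attn_weights, start, end):
    """Mean attention weight per token over positions [start, end).
    Assumed definition; the paper may normalize differently."""
    span = attn_weights[start:end]
    return sum(span) / len(span)

def label_dominance(attn_weights, label_span, content_span):
    """Ratio of label-region density to content-region density.
    Values above 1 mean the label region receives denser attention."""
    label = attention_density(attn_weights, *label_span)
    content = attention_density(attn_weights, *content_span)
    return label / content

def decision_entropy(logits):
    """Shannon entropy (in nats) of the softmax over answer-option logits.
    Higher entropy is read here as higher decision uncertainty."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy example: 2 label tokens followed by 6 content tokens (made-up weights).
toy_attention = [0.2, 0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
dominance = label_dominance(toy_attention, (0, 2), (2, 8))

# Toy logits over two answer options: a confident and an uncertain decision.
confident = decision_entropy([5.0, 0.0])
uncertain = decision_entropy([1.0, 0.0])
```

In this toy setup the label tokens carry twice the per-token attention of the content tokens, and the flatter logit vector yields the larger entropy. Comparing such quantities between Human-label and AI-label prompts for the same content is one way to read the contrast the abstract describes.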

Anthology ID: 2026.acl-long.1495
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 32378–32392
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1495/
Cite (ACL): Xin Sun, Di Wu, Sijing Qin, Isao Echizen, Abdallah El Ali, and Saku Sugawara. 2026. Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 32378–32392, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge (Sun et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1495.pdf
Checklist: 2026.acl-long.1495.checklist.pdf
