Cuicui Luo
2026
Leveraging Human and Machine Preferences for Zero-shot Detection of AI-Generated Text
Lei Jiang | Desheng Wu | Xiaolong Zheng | Cuicui Luo
Findings of the Association for Computational Linguistics: ACL 2026
In recent years, the rapid advancement of large language models (LLMs) has enabled generated texts to closely mimic human writing, posing significant challenges to the detection of AI-generated content. Current mainstream zero-shot detection methods largely adopt a machine-centric perspective, relying on proxy models to compute token-level AI-likelihood scores and treating all tokens equally during overall detection. However, such approaches overlook the prediction discrepancies that arise when humans and large language models interpret the same text. We argue that tokens exhibiting greater divergence between human and machine predictions can provide stronger clues for determining the authorship of a text. To address this limitation, we propose HAPDA, a human-machine prediction discrepancy adapter for AI-generated text detection (AGTD). The framework consists of two core components: (1) a joint fine-tuning strategy for training paired human-preference and machine-preference models, and (2) a discrepancy-aware reweighting mechanism designed to calibrate token-level detection scores in downstream detectors. Extensive experiments demonstrate that HAPDA consistently and significantly enhances the detection performance of five representative baseline models under various evaluation scenarios.
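The idea of discrepancy-aware reweighting can be illustrated with a small sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the function names `discrepancy_weights` and `reweighted_score`, the use of an absolute log-probability gap as the discrepancy measure, the softmax normalization, the `temperature` parameter, and the toy numbers. It is not the authors' HAPDA implementation. It only shows how tokens on which a human-preference model and a machine-preference model disagree more could receive more weight than under uniform averaging.

```python
import math


def discrepancy_weights(human_logprobs, machine_logprobs, temperature=1.0):
    """Return per-token weights that grow with the human/machine prediction gap.

    Hypothetical weighting: softmax over |human - machine| log-probability gaps.
    The paper's actual discrepancy measure is not stated in the abstract.
    """
    if len(human_logprobs) != len(machine_logprobs):
        raise ValueError("per-token inputs must have equal length")
    if not human_logprobs:
        raise ValueError("at least one token is required")
    gaps = [abs(h - m) / temperature
            for h, m in zip(human_logprobs, machine_logprobs)]
    peak = max(gaps)  # subtract the max for numerical stability
    exps = [math.exp(g - peak) for g in gaps]
    total = sum(exps)
    return [e / total for e in exps]


def reweighted_score(token_scores, weights):
    """Weighted sum of token-level detection scores (weights sum to 1)."""
    if len(token_scores) != len(weights):
        raise ValueError("scores and weights must have equal length")
    return sum(w * s for w, s in zip(weights, token_scores))


# Toy inputs (made up): token-level AI-likelihood scores from some detector,
# plus log-probabilities from the two hypothetical preference models.
token_scores = [0.2, 0.9, 0.4]
human_lp = [-1.0, -4.0, -2.0]
machine_lp = [-1.1, -0.5, -2.0]

weights = discrepancy_weights(human_lp, machine_lp)
uniform = sum(token_scores) / len(token_scores)   # all tokens treated equally
calibrated = reweighted_score(token_scores, weights)
```

In this toy case the second token has the largest gap between the two models, so it dominates the calibrated score, whereas the uniform average treats all three tokens alike.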