DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains
Zhihui Chen, Kai He, Yucheng Huang, Yunxiao Zhu, Mengling Feng
Abstract
Detecting LLM-generated text in specialized, high-stakes domains such as medicine and law is crucial for combating misinformation and ensuring authenticity. However, current zero-shot detectors, while effective on general text, often fail on specialized content due to domain shift. We provide a theoretical analysis showing that this failure is fundamentally linked to the KL divergence among the human, detector, and source text distributions. To address this, we propose DivScore, a zero-shot detection framework that uses normalized entropy-based scoring and domain knowledge distillation to robustly identify LLM-generated text in specialized domains. Experiments on medical and legal datasets show that DivScore consistently outperforms state-of-the-art detectors, with 14.4% higher AUROC and 64.0% higher recall at a 0.1% false-positive-rate threshold. In adversarial settings, DivScore is markedly more robust than the baselines, with an average advantage of 22.8% in AUROC and 29.5% in recall.
- Anthology ID:
- 2025.emnlp-main.971
- Volume:
- Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
- Month:
- November
- Year:
- 2025
- Address:
- Suzhou, China
- Editors:
- Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 19242–19264
- URL:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.971/
- Cite (ACL):
- Zhihui Chen, Kai He, Yucheng Huang, Yunxiao Zhu, and Mengling Feng. 2025. DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 19242–19264, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal):
- DivScore: Zero-Shot Detection of LLM-Generated Text in Specialized Domains (Chen et al., EMNLP 2025)
- PDF:
- https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.971.pdf
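The abstract's "normalized entropy-based scoring" can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's actual formulation: it scores a text by its mean negative log-likelihood under a detector model, normalized by the model's mean predictive entropy, so that texts the model finds unusually predictable (a common signature of LLM-generated text) receive low scores. The function name and inputs (`token_logprobs`, `token_entropies`) are hypothetical placeholders for quantities a scoring model would provide.

```python
def normalized_entropy_score(token_logprobs, token_entropies):
    """Toy zero-shot score (illustrative, not DivScore itself).

    token_logprobs: log-probability the model assigned to each observed token.
    token_entropies: the model's predictive entropy at each token position.
    Returns mean NLL divided by mean entropy; lower values suggest the text
    is more 'expected' by the model than its uncertainty would predict.
    """
    n = len(token_logprobs)
    nll = -sum(token_logprobs) / n           # mean negative log-likelihood
    mean_entropy = sum(token_entropies) / n  # mean predictive entropy
    return nll / mean_entropy

# Toy example: log-probs and entropies for a 4-token sequence.
logps = [-0.5, -1.2, -0.3, -0.9]
ents = [1.1, 0.9, 1.3, 1.0]
score = normalized_entropy_score(logps, ents)
print(score)
```

In practice the per-token quantities would come from a detector LLM's output logits; the abstract's domain knowledge distillation step (adapting the detector to medical or legal text) is what makes such a score reliable under domain shift.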