Enhancing LLM Text Detection with Retrieved Contexts and Logits Distribution Consistency

Zhaoheng Huang, Yutao Zhu, Ji-Rong Wen, Zhicheng Dou


Abstract
Large language models (LLMs) can generate fluent text, raising concerns about misuse in online comments and academic writing and leading to issues such as corpus pollution and copyright infringement. Existing LLM text detection methods often rely on features from the logit distribution of the input text. However, because some texts are short or carry little information, the distinction between LLM-generated and human-written text may rest on only a few tokens, making the differences in logit distributions minimal and hard to detect. To address this, we propose HALO, an LLM-based detection method that leverages external text corpora to evaluate the difference in the logit distribution of the input text under retrieved human-written and LLM-rewritten contexts. HALO also complements basic detection features and can serve as a plug-and-play module to enhance existing detection methods. Extensive experiments on five public datasets with three widely used source LLMs show that HALO achieves state-of-the-art AUROC in both cross-domain and domain-specific scenarios.
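The core idea of comparing the input text's logit distribution under two different contexts can be illustrated with a small sketch. The abstract does not state how the difference is measured, so everything below is an assumption made for illustration: the mean per-token KL divergence, the function names, and the toy logit values are hypothetical and are not taken from HALO itself. In practice the per-token logits would come from a scoring LLM run twice on the same input, once with a retrieved human-written context prepended and once with its LLM-rewritten counterpart.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def context_consistency_score(logits_human_ctx, logits_llm_ctx):
    """Mean per-token KL divergence between two context-conditioned
    next-token distributions for the same input text.

    Each argument is a list with one logit vector per input token.
    NOTE: this scoring rule is an illustrative assumption, not the
    feature defined in the paper.
    """
    if len(logits_human_ctx) != len(logits_llm_ctx):
        raise ValueError("both runs must cover the same input tokens")
    if not logits_human_ctx:
        raise ValueError("at least one token position is required")
    kls = [
        kl_divergence(softmax(a), softmax(b))
        for a, b in zip(logits_human_ctx, logits_llm_ctx)
    ]
    return sum(kls) / len(kls)


# Toy example: two token positions over a three-word vocabulary.
# The numbers are made up purely to exercise the function.
under_human_context = [[1.0, 2.0, 0.5], [0.1, 0.2, 0.3]]
under_llm_context = [[2.0, 1.0, 0.5], [0.3, 0.2, 0.1]]

score = context_consistency_score(under_human_context, under_llm_context)
print(f"context consistency score: {score:.4f}")
```

A score near zero means the two contexts barely change the distribution over the input tokens; a larger score means the distribution shifts with the context. Such a score could then be fed, alongside existing logit-based features, into whatever classifier a detector already uses.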
Anthology ID:
2025.emnlp-main.503
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
9933–9945
URL:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.503/
Cite (ACL):
Zhaoheng Huang, Yutao Zhu, Ji-Rong Wen, and Zhicheng Dou. 2025. Enhancing LLM Text Detection with Retrieved Contexts and Logits Distribution Consistency. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 9933–9945, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Enhancing LLM Text Detection with Retrieved Contexts and Logits Distribution Consistency (Huang et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.503.pdf
Checklist:
 2025.emnlp-main.503.checklist.pdf