Low-Entropy Watermark Detection via Bayes’ Rule Derived Detector

Beining Huang, Du Su, Fei Sun, Qi Cao, Huawei Shen, Xueqi Cheng


Abstract
Text watermarking, which modifies tokens to embed a watermark, has proven effective in detecting machine-generated text. Yet its application to low-entropy text such as code and mathematics presents significant challenges. Many tokens in these texts can hardly be modified without changing the intended meaning, causing statistical measures to falsely indicate the absence of a watermark. Existing research addresses this issue by relying mainly on a limited number of high-entropy tokens, which are considered flexible enough to modify and therefore to reflect watermarks accurately. However, the detection accuracy of these approaches remains suboptimal, as they neglect the strong watermark evidence embedded in low-entropy tokens that were modified through watermarking. To overcome this limitation, we introduce the Bayes’ Rule derived Watermark Detector (BRWD), which exploits watermark information from every token by leveraging the posterior probability of the watermark’s presence. We theoretically prove the optimality of our method in terms of detection accuracy, and demonstrate its superiority across various datasets, models, and watermark injection strategies. Notably, our method achieves up to 50% and 70% relative improvements in detection accuracy over the best baselines in code generation and math problem-solving tasks, respectively. Our code is available at https://github.com/cczslp/BRWD.
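To make the idea of a posterior over the watermark's presence concrete, the following is a minimal sketch and not the BRWD implementation from the paper. It assumes a logit-boost ("green list") watermark in which green tokens receive an additive bias `delta`, and it assumes the detector can obtain, for each position, the model's unwatermarked probability mass on the green list (`green_mass`). The function names, the default `delta`, and the prior are all hypothetical choices for illustration.

```python
import math


def log_likelihood_ratio(is_green, green_mass, delta):
    """log P(token | watermarked) - log P(token | unwatermarked) for one token.

    Assumes the watermark multiplies the probability of every green token by
    exp(delta) and renormalizes (an assumption, not taken from the paper).
    """
    # Normalizer of the boosted distribution: boosted green mass plus red mass.
    log_norm = math.log(green_mass * math.exp(delta) + (1.0 - green_mass))
    return (delta if is_green else 0.0) - log_norm


def posterior_watermarked(tokens, delta=2.0, prior=0.5):
    """Posterior probability that a text is watermarked, via Bayes' rule.

    tokens: iterable of (is_green, green_mass) pairs, one per position.
    """
    # Start from the prior log-odds and add each token's evidence.
    log_odds = math.log(prior / (1.0 - prior))
    for is_green, green_mass in tokens:
        log_odds += log_likelihood_ratio(is_green, green_mass, delta)
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    # A green token at a position where the model put almost no mass on green
    # tokens carries more evidence than one at a balanced position.
    print(log_likelihood_ratio(True, 0.01, 2.0))
    print(log_likelihood_ratio(True, 0.5, 2.0))
    print(posterior_watermarked([(True, 0.5)] * 10))
```

Under these assumptions, a token whose position is almost fully determined (green mass near 0 or 1) contributes nearly zero evidence when it follows the dominant choice, but a large log-likelihood ratio when a green token appears despite low green mass. This is one way to read the abstract's point that low-entropy tokens modified by watermarking carry strong evidence.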
Anthology ID:
2025.findings-acl.739
Volume:
Findings of the Association for Computational Linguistics: ACL 2025
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14330–14344
URL:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.739/
Cite (ACL):
Beining Huang, Du Su, Fei Sun, Qi Cao, Huawei Shen, and Xueqi Cheng. 2025. Low-Entropy Watermark Detection via Bayes’ Rule Derived Detector. In Findings of the Association for Computational Linguistics: ACL 2025, pages 14330–14344, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Low-Entropy Watermark Detection via Bayes’ Rule Derived Detector (Huang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/display_plenaries/2025.findings-acl.739.pdf