CodeRipple: Wavelet-Based Detection of LLM-Generated Code

Xingyu Yao; Zhendong Mao; Quan Wang

Abstract

Detecting LLM-generated code is crucial for ensuring software provenance, security, reliability, and licensing compliance. Existing training-free detectors, mostly adapted from text-based methods, rely on global statistics of the Token Perplexity Sequence (TPS) and struggle with code. We reveal a key insight: although their global statistics converge, LLM-generated and human-written code differ fundamentally in their local TPS dynamics. The former shows narrow, transient spikes, while the latter exhibits broad, sustained fluctuations. To capture this distinction, we introduce CodeRipple, a novel training-free detection framework that employs wavelet analysis to characterize TPS morphology across scales. It jointly leverages the Stationary Wavelet Transform to model fluctuation shape and the Discrete Wavelet Transform to quantify cross-scale energy distribution. Evaluated on three challenging benchmarks spanning diverse programming languages, multiple generating LLMs, and various evasion strategies, CodeRipple consistently outperforms existing training-free methods, demonstrating its effectiveness and generalizability without any model training. Code is available at https://github.com/yaoxingyu77/CodeRipple.
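To make the idea of a cross-scale energy distribution concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a plain Haar wavelet, a hand-written decomposition in pure Python, and two synthetic perplexity sequences. The names `haar_dwt_step`, `relative_detail_energy`, `spiky`, and `broad` are hypothetical and chosen only for illustration. The Stationary Wavelet Transform component mentioned in the abstract is not shown.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def relative_detail_energy(tps, levels):
    """Share of the total detail energy at each scale (index 0 = finest scale)."""
    if len(tps) % (2 ** levels) != 0:
        raise ValueError("sequence length must be divisible by 2**levels")
    approx = list(tps)
    energies = []
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        energies.append(sum(d * d for d in detail))
    total = sum(energies)
    # A constant sequence has no detail energy at any scale.
    return [e / total if total > 0 else 0.0 for e in energies]


# Synthetic stand-ins for token perplexity sequences (assumed values, not real data).
spiky = [1.0] * 16
spiky[5] = 9.0            # one narrow, transient spike

broad = [1.0] * 16
for i in range(4, 12):
    broad[i] = 5.0        # one broad, sustained elevation

spiky_shares = relative_detail_energy(spiky, levels=3)
broad_shares = relative_detail_energy(broad, levels=3)

print("spiky:", [round(x, 3) for x in spiky_shares])
print("broad:", [round(x, 3) for x in broad_shares])
```

In this toy setting the narrow spike puts most of its detail energy in the finest scale, while the broad plateau puts it in the coarsest scale. The decimated DWT is shift-sensitive, so the exact shares depend on where the fluctuation falls relative to the dyadic grid.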

Anthology ID: 2026.acl-long.1777
Volume: Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2026
Address: San Diego, California, United States
Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 38351–38364
URL: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1777/
Cite (ACL): Xingyu Yao, Zhendong Mao, and Quan Wang. 2026. CodeRipple: Wavelet-Based Detection of LLM-Generated Code. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 38351–38364, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal): CodeRipple: Wavelet-Based Detection of LLM-Generated Code (Yao et al., ACL 2026)
PDF: https://preview.aclanthology.org/ingest-acl/2026.acl-long.1777.pdf
Checklist: 2026.acl-long.1777.checklist.pdf