Luo Ping
2025
Lost in Overlap: Exploring Logit-based Watermark Collision in LLMs
Yiyang Luo | Ke Lin | Chao Gu | Jiahui Hou | Lijie Wen | Luo Ping
Findings of the Association for Computational Linguistics: NAACL 2025
The proliferation of large language models (LLMs) in generating content raises concerns about text copyright. Watermarking methods, particularly logit-based approaches, embed imperceptible identifiers into text to address these challenges. However, the widespread use of watermarking across diverse LLMs has led to an inevitable issue known as watermark collision during common tasks such as paraphrasing or translation. In this paper, we introduce watermark collision as a novel and general philosophy for watermark attacks, aimed at enhancing attack performance on top of any other attack method. We also provide a comprehensive demonstration that watermark collision poses a threat to all logit-based watermark algorithms, affecting not only specific attack scenarios but also downstream applications.
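For context on the logit-based watermarking the abstract refers to, the sketch below illustrates the generic idea: a pseudo-random "green list" of tokens, seeded by the previous token, receives a small logit bonus at each decoding step, leaving a statistically detectable trace. This is a minimal illustration, not the paper's code; the names green_list, watermark_logits, GREEN_FRACTION, and BIAS are assumptions made for this example.

# Minimal sketch of a generic logit-biasing ("green-list") watermark.
# Illustrative only; not the implementation evaluated in the paper.
import hashlib
import numpy as np

VOCAB_SIZE = 50_000
GREEN_FRACTION = 0.5   # fraction of the vocabulary favored at each step
BIAS = 2.0             # logit boost added to green-list tokens

def green_list(prev_token_id: int) -> np.ndarray:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(str(prev_token_id).encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    mask = np.zeros(VOCAB_SIZE, dtype=bool)
    green_ids = rng.choice(VOCAB_SIZE, size=int(GREEN_FRACTION * VOCAB_SIZE), replace=False)
    mask[green_ids] = True
    return mask

def watermark_logits(logits: np.ndarray, prev_token_id: int) -> np.ndarray:
    """Bias green-list logits before sampling; detection later counts green tokens."""
    biased = logits.copy()
    biased[green_list(prev_token_id)] += BIAS
    return biased

# Toy usage: bias a random logit vector conditioned on the last generated token.
logits = np.random.randn(VOCAB_SIZE)
biased = watermark_logits(logits, prev_token_id=1234)

When a second watermarker (e.g. a paraphrasing LLM) applies its own green-list bias on top of such output, the two biases overlap, which is the collision setting the paper studies.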
2024
Zero-shot Generative Linguistic Steganography
Ke Lin | Yiyang Luo | Zijian Zhang | Luo Ping
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Generative linguistic steganography attempts to hide secret messages within covertext. Previous studies have generally focused on the statistical differences between the covertext and stegotext; however, ill-formed stegotext can readily be identified by humans. In this paper, we propose a novel zero-shot approach based on in-context learning for linguistic steganography to achieve better perceptual and statistical imperceptibility. We also design several new metrics and reproducible language evaluations to measure the imperceptibility of the stegotext. Our experimental results indicate that our method produces 1.926× more innocent and intelligible stegotext than any other method.
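For readers unfamiliar with generative linguistic steganography, the toy sketch below shows the basic principle of letting secret bits steer token selection from a language model's next-token distribution. It is a generic bins-style illustration, not the paper's zero-shot in-context method; fake_next_token_probs and embed_bits are hypothetical stand-ins.

# Toy illustration of hiding bits in token choices; not the paper's method.
import numpy as np

def fake_next_token_probs(vocab_size: int, rng: np.random.Generator) -> np.ndarray:
    """Stand-in for a language model's next-token distribution."""
    p = rng.random(vocab_size)
    return p / p.sum()

def embed_bits(bits: str, vocab_size: int = 256, bits_per_step: int = 2) -> list[int]:
    """Encode `bits` by indexing into the top-2^b most likely tokens at each step."""
    rng = np.random.default_rng(0)
    tokens = []
    for i in range(0, len(bits), bits_per_step):
        chunk = bits[i:i + bits_per_step].ljust(bits_per_step, "0")
        probs = fake_next_token_probs(vocab_size, rng)
        top = np.argsort(probs)[::-1][: 2 ** bits_per_step]  # candidate tokens
        tokens.append(int(top[int(chunk, 2)]))               # bit chunk picks the token
    return tokens

print(embed_bits("110100"))  # -> three token ids carrying the six secret bits

Restricting choices to high-probability tokens keeps the stegotext statistically close to ordinary model output, which is the imperceptibility goal the abstract describes.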