Lost in Overlap: Exploring Logit-based Watermark Collision in LLMs

Yiyang Luo, Ke Lin, Chao Gu, Jiahui Hou, Lijie Wen, Luo Ping


Abstract
The proliferation of large language models (LLMs) in generating content raises concerns about text copyright. Watermarking methods, particularly logit-based approaches, embed imperceptible identifiers into text to address these challenges. However, the widespread usage of watermarking across diverse LLMs has led to an inevitable issue known as watermark collision during common tasks, such as paraphrasing or translation. In this paper, we introduce watermark collision as a novel and general philosophy for watermark attacks, aimed at enhancing attack performance on top of any other attacking methods. We also provide a comprehensive demonstration that watermark collision poses a threat to all logit-based watermark algorithms, impacting not only specific attack scenarios but also downstream applications.
Anthology ID:
2025.findings-naacl.37
Volume:
Findings of the Association for Computational Linguistics: NAACL 2025
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
620–637
URL:
https://preview.aclanthology.org/moar-dois/2025.findings-naacl.37/
DOI:
10.18653/v1/2025.findings-naacl.37
Cite (ACL):
Yiyang Luo, Ke Lin, Chao Gu, Jiahui Hou, Lijie Wen, and Luo Ping. 2025. Lost in Overlap: Exploring Logit-based Watermark Collision in LLMs. In Findings of the Association for Computational Linguistics: NAACL 2025, pages 620–637, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
Lost in Overlap: Exploring Logit-based Watermark Collision in LLMs (Luo et al., Findings 2025)
PDF:
https://preview.aclanthology.org/moar-dois/2025.findings-naacl.37.pdf