Watermark under Fire: A Robustness Evaluation of LLM Watermarking

Jiacheng Liang, Zian Wang, Spencer Hong, Shouling Ji, Ting Wang


Abstract
Various watermarking methods (“watermarkers”) have been proposed to identify LLM-generated text; yet, due to the lack of a unified evaluation platform, many critical questions remain under-explored: i) What are the strengths and limitations of various watermarkers, especially their robustness to attacks? ii) How do various design choices affect their robustness? iii) How can watermarkers be operated optimally in adversarial environments? To fill this gap, we systematize existing LLM watermarkers and watermark removal attacks, mapping out their design spaces. We then develop WaterPark, a unified platform that integrates 10 state-of-the-art watermarkers and 12 representative attacks. More importantly, by leveraging WaterPark, we conduct a comprehensive assessment of existing watermarkers, unveiling the impact of various design choices on their attack robustness. We further explore best practices for operating watermarkers in adversarial environments. We believe our study sheds light on current LLM watermarking techniques, while WaterPark serves as a valuable testbed to facilitate future research.
Anthology ID:
2025.findings-emnlp.1148
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2025
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
21050–21074
URL:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1148/
DOI:
10.18653/v1/2025.findings-emnlp.1148
Cite (ACL):
Jiacheng Liang, Zian Wang, Spencer Hong, Shouling Ji, and Ting Wang. 2025. Watermark under Fire: A Robustness Evaluation of LLM Watermarking. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 21050–21074, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Watermark under Fire: A Robustness Evaluation of LLM Watermarking (Liang et al., Findings 2025)
PDF:
https://preview.aclanthology.org/name-variant-enfa-fane/2025.findings-emnlp.1148.pdf
Checklist:
2025.findings-emnlp.1148.checklist.pdf