The Feasibility of Topic-Based Watermarking on Academic Peer Reviews

Alexander Nemecek, Yuzhou Jiang, Erman Ayday


Abstract
Large language models (LLMs) are increasingly integrated into academic workflows, with many conferences and journals permitting their use for tasks such as language refinement and literature summarization. However, their use in peer review remains prohibited due to concerns around confidentiality breaches, hallucinated content, and inconsistent evaluations. As LLM-generated text becomes more indistinguishable from human writing, there is a growing need for reliable attribution mechanisms to preserve the integrity of the review process. In this work, we evaluate topic-based watermarking (TBW), a semantic-aware technique designed to embed detectable signals into LLM-generated text. We conduct a systematic assessment across multiple LLM configurations, including base, few-shot, and fine-tuned variants, using authentic peer review data from academic conferences. Our results show that TBW maintains review quality relative to non-watermarked outputs, while demonstrating robust detection performance under paraphrasing. These findings highlight the viability of TBW as a minimally intrusive and practical solution for LLM attribution in peer review settings.
Anthology ID:
2025.findings-ijcnlp.36
Volume:
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Kentaro Inui, Sakriani Sakti, Haofen Wang, Derek F. Wong, Pushpak Bhattacharyya, Biplab Banerjee, Asif Ekbal, Tanmoy Chakraborty, Dhirendra Pratap Singh
Venue:
Findings
SIG:
Publisher:
The Asian Federation of Natural Language Processing and The Association for Computational Linguistics
Note:
Pages:
616–634
Language:
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.36/
DOI:
Bibkey:
Cite (ACL):
Alexander Nemecek, Yuzhou Jiang, and Erman Ayday. 2025. The Feasibility of Topic-Based Watermarking on Academic Peer Reviews. In Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 616–634, Mumbai, India. The Asian Federation of Natural Language Processing and The Association for Computational Linguistics.
Cite (Informal):
The Feasibility of Topic-Based Watermarking on Academic Peer Reviews (Nemecek et al., Findings 2025)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.findings-ijcnlp.36.pdf