Advancing Beyond Identification: Multi-bit Watermark for Large Language Models

Kiyoon Yoo; Wonhyuk Ahn; Nojun Kwak

Abstract

We show the viability of tackling misuses of large language models beyond the identification of machine-generated text. While existing zero-bit watermark methods focus on detection only, some malicious misuses demand tracing the adversarial user in order to counteract them. To address this, we propose Multi-bit Watermark via Position Allocation, which embeds traceable multi-bit information during language model generation. By allocating tokens to different parts of the message, we embed longer messages in high-corruption settings without added latency. By independently embedding sub-units of messages, the proposed method outperforms existing works in terms of robustness and latency. Leveraging the benefits of zero-bit watermarking, our method enables robust extraction of the watermark without any model access, embedding and extraction of long messages (≥ 32-bit) without finetuning, and preservation of text quality, while still allowing zero-bit detection.
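The position-allocation idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. It assumes a green-list style logit bias (as in zero-bit watermarking), random numbers standing in for language model logits, and a hash of the previous token as the source of pseudo-randomness. All names (`position_for`, `list_assignment`, `embed`, `extract`), the key, and the constants are hypothetical.

```python
import hashlib
import math
import random

VOCAB_SIZE = 1000        # toy vocabulary size (assumption)
RADIX = 4                # each message position holds one base-4 digit (2 bits)
NUM_POSITIONS = 8        # 8 positions x 2 bits = a 16-bit message
SECRET_KEY = "demo-key"  # hypothetical key shared by embedder and extractor
START_TOKEN = 0          # hypothetical context for the first generated token


def _seed(prev_token: int, purpose: str) -> int:
    """Derive a deterministic seed from the key and the previous token."""
    text = f"{SECRET_KEY}:{purpose}:{prev_token}".encode()
    return int.from_bytes(hashlib.sha256(text).digest()[:8], "big")


def position_for(prev_token: int) -> int:
    """Allocate the next token to one position (sub-unit) of the message."""
    return _seed(prev_token, "pos") % NUM_POSITIONS


def list_assignment(prev_token: int) -> list[int]:
    """Split the vocabulary pseudo-randomly into RADIX equal-sized lists."""
    perm = list(range(VOCAB_SIZE))
    random.Random(_seed(prev_token, "perm")).shuffle(perm)
    assignment = [0] * VOCAB_SIZE
    for rank, tok in enumerate(perm):
        assignment[tok] = rank * RADIX // VOCAB_SIZE
    return assignment


def embed(message: list[int], num_tokens: int, delta: float = 8.0, seed: int = 0) -> list[int]:
    """Sample tokens while biasing the list selected by the allocated digit."""
    rng = random.Random(seed)
    tokens, prev = [], START_TOKEN
    for _ in range(num_tokens):
        digit = message[position_for(prev)]
        assignment = list_assignment(prev)
        # Random logits stand in for a real language model (assumption).
        logits = [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]
        weights = [
            math.exp(logit + (delta if assignment[tok] == digit else 0.0))
            for tok, logit in enumerate(logits)
        ]
        prev = rng.choices(range(VOCAB_SIZE), weights=weights)[0]
        tokens.append(prev)
    return tokens


def extract(tokens: list[int]) -> list[int]:
    """Recover each digit by majority vote; no model access is needed."""
    counts = [[0] * RADIX for _ in range(NUM_POSITIONS)]
    prev = START_TOKEN
    for tok in tokens:
        counts[position_for(prev)][list_assignment(prev)[tok]] += 1
        prev = tok
    return [c.index(max(c)) for c in counts]


message = [3, 0, 1, 2, 2, 1, 0, 3]   # 16 bits as eight base-4 digits
tokens = embed(message, num_tokens=400)
print(extract(tokens) == message)
```

Because every token votes only for the position it was allocated to, each digit is decoded independently, so corrupting some tokens affects only the votes of their own positions. The extractor needs only the key and the token sequence, which mirrors the model-free extraction described in the abstract.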

Anthology ID: 2024.naacl-long.224
Volume: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Kevin Duh, Helena Gomez, Steven Bethard
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 4031–4055
URL: https://aclanthology.org/2024.naacl-long.224
Cite (ACL): KiYoon Yoo, Wonhyuk Ahn, and Nojun Kwak. 2024. Advancing Beyond Identification: Multi-bit Watermark for Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4031–4055, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): Advancing Beyond Identification: Multi-bit Watermark for Large Language Models (Yoo et al., NAACL 2024)
PDF: https://preview.aclanthology.org/jeptaln-2024-ingestion/2024.naacl-long.224.pdf
