Watermarking LLMs with Weight Quantization

Linyang Li, Botian Jiang, Pengyu Wang, Ke Ren, Hang Yan, Xipeng Qiu


Abstract
Abuse of large language models poses high risks as they are deployed at an astonishing speed. It is important to protect model weights against malicious usage that violates the licenses of open-source large language models. This paper proposes a novel watermarking strategy that plants watermarks in the quantization process of large language models, without pre-defined triggers during inference. The watermark works when the model runs in fp32 mode and remains hidden when the model is quantized to int8; in this way, users can only run inference with the model and cannot perform further supervised fine-tuning. We successfully plant the watermark into open-source large language model weights, including GPT-Neo and LLaMA. We hope the proposed method provides a potential direction for protecting model weights in the era of large language model applications.
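The core observation the abstract relies on is that int8 quantization is lossy: perturbations to fp32 weights that are smaller than the quantization step size vanish after the quantize/dequantize round trip. The sketch below is a minimal illustration of that property with symmetric per-tensor int8 quantization, not the paper's actual watermarking algorithm; the mask construction and step sizes are illustrative assumptions.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: scale by max |w| / 127."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1000).astype(np.float32)

# Hypothetical "watermark": nudge weights that sit safely away from a
# rounding boundary by a fraction of one quantization step, so the fp32
# weights change but every weight still rounds to the same int8 value.
scale = np.abs(w).max() / 127.0
r = w / scale
frac = r - np.round(r)
mask = (np.abs(frac) < 0.3) & (np.abs(r) < 126.0)  # stay inside the int8 range
mask[np.argmax(np.abs(w))] = False  # leave the max untouched so the scale is unchanged

w_marked = w.copy()
w_marked[mask] += 0.1 * scale  # perturbation well below half a quantization step

q_plain, s_plain = quantize_int8(w)
q_marked, s_marked = quantize_int8(w_marked)

# The fp32 weights differ (watermark visible), but the int8 models coincide
# (watermark hidden after quantization).
print(np.allclose(w, w_marked))          # False
print(np.array_equal(q_plain, q_marked)) # True
```

A real scheme would have to encode a detectable signal (and keep model quality intact) within this sub-step budget, but the sketch captures why the watermark can be visible in fp32 yet invisible in int8.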
Anthology ID:
2023.findings-emnlp.220
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3368–3378
URL:
https://aclanthology.org/2023.findings-emnlp.220
DOI:
10.18653/v1/2023.findings-emnlp.220
Cite (ACL):
Linyang Li, Botian Jiang, Pengyu Wang, Ke Ren, Hang Yan, and Xipeng Qiu. 2023. Watermarking LLMs with Weight Quantization. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 3368–3378, Singapore. Association for Computational Linguistics.
Cite (Informal):
Watermarking LLMs with Weight Quantization (Li et al., Findings 2023)
PDF:
https://preview.aclanthology.org/naacl24-info/2023.findings-emnlp.220.pdf