RShield: A User-level Traceable Backdoor Watermark for LLMs in Embedding-as-a-Service
Lingyun Xiang, Yufan Zhong, Chengfu Ou, Zhihua Xia, Chunfang Yang, Daojian Zeng, Zhangjie Fu
Abstract
Embedding-as-a-Service (EaaS) has emerged as a critical paradigm for commercializing large language models (LLMs). However, existing backdoor watermarking techniques are fundamentally limited to "zero-bit" detection, which prevents user-level traceability in multi-user EaaS scenarios. To address this limitation, we propose RShield, a multi-bit backdoor watermarking scheme that enables reliable user-level attribution of LLMs for EaaS under model extraction attacks. RShield integrates Reed-Solomon error-correcting codes with orthogonal feature mapping to introduce highly structured redundancy, constructing fault-tolerant symbol sequences for the multi-bit watermark space so that the watermark stays recoverable even under aggressive extraction noise. To mitigate semantic distortion caused by interference from the noisy channel, RShield employs a lightweight Adapter to adaptively inject multi-bit watermarks in the feature space, preserving the quality of EaaS while achieving user-level traceability. Extensive experiments on four NLP benchmarks demonstrate that RShield efficiently achieves 100% multi-bit watermark recovery and high semantic fidelity under model extraction attacks compared to existing methods, while significantly reducing the degradation of downstream task performance caused by watermarking.
- Anthology ID: 2026.findings-acl.1347
- Volume: Findings of the Association for Computational Linguistics: ACL 2026
- Month: July
- Year: 2026
- Address: San Diego, California, United States
- Editors: Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 27014–27028
- URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1347/
- Cite (ACL): Lingyun Xiang, Yufan Zhong, Chengfu Ou, Zhihua Xia, Chunfang Yang, Daojian Zeng, and Zhangjie Fu. 2026. RShield: A User-level Traceable Backdoor Watermark for LLMs in Embedding-as-a-Service. In Findings of the Association for Computational Linguistics: ACL 2026, pages 27014–27028, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal): RShield: A User-level Traceable Backdoor Watermark for LLMs in Embedding-as-a-Service (Xiang et al., Findings 2026)
- PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.findings-acl.1347.pdf