Linchen Yu
2025
ESF: Efficient Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models
Xiaofan Bai | Pingyi Hu | Xiaojing Ma | Linchen Yu | Dongmei Zhang | Qi Zhang | Bin Benjamin Zhu
Findings of the Association for Computational Linguistics: ACL 2025
The rapid adoption of large language models (LLMs) in diverse applications has intensified concerns over their security and integrity, especially in cloud environments where internal model parameters are inaccessible to users. Traditional tamper detection methods, designed for deterministic classification models, fail to address the output randomness and massive parameter spaces characteristic of LLMs. In this paper, we introduce Efficient Sensitive Fingerprinting (ESF), the first fingerprinting method tailored for black-box tamper detection of LLMs. ESF generates fingerprint samples by optimizing output sensitivity at selected detection token positions and leverages Randomness-Set Consistency Checking (RSCC) to accommodate inherent output randomness. Furthermore, a novel Max Coverage Strategy (MCS) is proposed to select an optimal set of fingerprint samples that maximizes joint sensitivity to tampering. Grounded in a rigorous theoretical framework, ESF is both computationally efficient and scalable to large models. Extensive experiments across state-of-the-art LLMs demonstrate that ESF reliably detects tampering, such as fine-tuning, model compression, and backdoor injection, with a detection rate exceeding 99.2% using 5 fingerprint samples, thereby offering a robust solution for securing cloud-based AI systems.