Hang fu
Other people with similar names: Hang Fu
2026
ImF: Embedding an Implicit Fingerprint in Your Large Language Models
Jiaxuan Wu | Wanli Peng | Hang fu | Xue Yiming | Juan Wen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Training and serving large language models (LLMs) is resource-intensive, making reliable intellectual property (IP) protection and black-box ownership verification increasingly important. Model fingerprinting enables such verification by injecting a small set of secret query–response behaviors, but many existing fingerprints rely on explicit markers or predetermined outputs that are weakly grounded in prompt semantics. This semantic mismatch yields atypical fingerprint responses, reduces stealthiness, and exposes fingerprints to removal by response normalization. We formalize this vulnerability via a new removal attack, Generation Revision Intervention (GRI), which applies system-prompt-level revision and response standardization to steer models toward typical answers, substantially compromising representative injected baselines. To close this semantic gap, we propose Implicit Fingerprints (ImF): we encode ownership information into a natural-looking target response y via linguistic steganography, then derive a CoT-augmented query x that embeds semantic cues from y to guide the model toward an output sufficiently close to y for decoding-based verification. Experiments on 15 LLMs show that ImF improves stealthiness and remains verifiable under model updates and deployment-time prompt interventions; additional analyses further show stability under common decoding variation and realistic related-model partial merging.
Inhibitory Attacks on Backdoor-based Fingerprinting for Large Language Models
Hang fu | Wanli Peng | Yinghan Zhou | Jiaxuan Wu | Juan Wen | Xue Yiming
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The widespread adoption of large language models (LLMs) in commercial and research settings has intensified the need for robust intellectual property protection. Backdoor-based LLM fingerprinting has emerged as a promising solution to this challenge. In practice, LLM ensembling, a low-cost multi-model collaboration technique, combines diverse LLMs to leverage their complementary strengths and has attracted significant attention and adoption. However, the vulnerability of existing LLM fingerprinting in the ensemble scenario remains unexplored. To comprehensively assess the robustness of LLM fingerprinting, we propose two novel fingerprinting attack methods: the token filter attack (TFA) and the sentence verification attack (SVA). At each decoding step, TFA selects the next token from a unified token set built by a token filter mechanism. SVA filters out fingerprint responses through a sentence verification mechanism based on perplexity and voting. Experiments show that the proposed methods effectively inhibit fingerprint responses while maintaining ensemble performance, and that they outperform state-of-the-art attack methods. These findings highlight the need for more robust LLM fingerprinting.