Wen-Juan Hou
Also published as: Wen Juan Hou, Juan Wen
2026
Inhibitory Attacks on Backdoor-based Fingerprinting for Large Language Models
Hang fu | Wanli Peng | Yinghan Zhou | Jiaxuan Wu | Juan Wen | Xue Yiming
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The widespread adoption of Large Language Models (LLMs) in commercial and research settings has intensified the need for robust intellectual property protection. Backdoor-based LLM fingerprinting has emerged as a promising solution to this challenge. In practice, LLM ensembling, a low-cost multi-model collaborative technique, combines diverse LLMs to leverage their complementary strengths and has garnered significant attention and adoption. Unfortunately, the vulnerability of existing LLM fingerprinting in the ensemble scenario remains unexplored. To comprehensively assess the robustness of LLM fingerprinting, we propose two novel fingerprinting attack methods: the token filter attack (TFA) and the sentence verification attack (SVA). At each decoding step, the TFA selects the next token from a unified set of tokens produced by a token filter mechanism. The SVA filters out fingerprint responses through a sentence verification mechanism based on perplexity and voting. Experimentally, the proposed methods effectively inhibit the fingerprint response while maintaining ensemble performance, and they outperform state-of-the-art attack methods. These findings highlight the need for enhanced robustness in LLM fingerprinting.
ImF: Embedding an Implicit Fingerprint in Your Large Language Models
Jiaxuan Wu | Wanli Peng | Hang fu | Xue Yiming | Juan Wen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Training and serving large language models (LLMs) is resource-intensive, making reliable intellectual property (IP) protection and black-box ownership verification increasingly important. Model fingerprinting enables such verification by injecting a small set of secret query–response behaviors, but many existing fingerprints rely on explicit markers or predetermined outputs that are weakly grounded in prompt semantics. This semantic mismatch yields atypical fingerprint responses, reduces stealthiness, and exposes fingerprints to removal by response normalization. We formalize this vulnerability via a new removal attack, Generation Revision Intervention (GRI), which applies system-prompt-level revision and response standardization to steer models toward typical answers, substantially compromising representative injected baselines. To close this semantic gap, we propose Implicit Fingerprints (ImF): we encode ownership information into a natural-looking target response y via linguistic steganography, then derive a CoT-augmented query x that embeds semantic cues from y to guide the model toward an output sufficiently close to y for decoding-based verification. Experiments on 15 LLMs show that ImF improves stealthiness and remains verifiable under model updates and deployment-time prompt interventions; additional analyses further show stability under common decoding variation and realistic related-model partial merging.
2025
Kill two birds with one stone: generalized and robust AI-generated text detection via dynamic perturbations
Yinghan Zhou | Juan Wen | Wanli Peng | Xue Yiming | ZiWei Zhang | Wu Zhengxian
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
The growing popularity of large language models has raised concerns about the potential misuse of AI-generated text (AIGT). It is becoming increasingly critical to establish an AIGT detection method with high generalization and robustness. However, existing methods focus on either model generalization or robustness. A unified mechanism that simultaneously addresses the challenges of generalization and robustness is less explored. In this paper, we first empirically reveal an intrinsic mechanism for model generalization and robustness in the AIGT detection task. Then, we propose a novel AIGT detection method (DP-Net) via dynamic perturbations introduced by reinforcement learning with an elaborate reward and action design. Experimentally, extensive results show that the proposed DP-Net significantly outperforms some state-of-the-art AIGT detection methods in generalization capacity across three cross-domain scenarios. Meanwhile, DP-Net achieves the best robustness under two text adversarial attacks.
2015
NTNU: An Unsupervised Knowledge Approach for Taxonomy Extraction
Bamfa Ceesay | Wen Juan Hou
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)
2013
Sentiment Classification for Movie Reviews in Chinese Using Parsing-based Methods
Wen-Juan Hou | Chuang-Ping Chang
Proceedings of the Sixth International Joint Conference on Natural Language Processing
2006
Word Segmentation and Named Entity Recognition for SIGHAN Bakeoff3
Suxiang Zhang | Ying Qin | Juan Wen | Xiaojie Wang
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing
Classifying Biological Full-Text Articles for Multi-Database Curation
Wen-Juan Hou | Chih Lee | Hsin-Hsi Chen
Demonstrations
2004
Annotating Multiple Types of Biomedical Entities: A Single Word Classification Approach
Chih Lee | Wen-Juan Hou | Hsin-Hsi Chen
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)
Support Vector Machine Approach to Extracting Gene References into Function from Biological Documents
Chih Lee | Wen-Juan Hou | Hsin-Hsi Chen
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)