Bin Benjamin Zhu


2025

pdf bib
ESF: Efficient Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models
Xiaofan Bai | Pingyi Hu | Xiaojing Ma | Linchen Yu | Dongmei Zhang | Qi Zhang | Bin Benjamin Zhu
Findings of the Association for Computational Linguistics: ACL 2025

The rapid adoption of large language models (LLMs) in diverse applications has intensified concerns over their security and integrity, especially in cloud environments where internal model parameters are inaccessible to users. Traditional tamper detection methods, designed for deterministic classification models, fail to address the output randomness and massive parameter spaces characteristic of LLMs. In this paper, we introduce Efficient Sensitive Fingerprinting (ESF), the first fingerprinting method tailored for black-box tamper detection of LLMs. ESF generates fingerprint samples by optimizing output sensitivity at selected detection token positions and leverages Randomness-Set Consistency Checking (RSCC) to accommodate inherent output randomness. Furthermore, a novel Max Coverage Strategy (MCS) is proposed to select an optimal set of fingerprint samples that maximizes joint sensitivity to tampering. Grounded in a rigorous theoretical framework, ESF is both computationally efficient and scalable to large models. Extensive experiments across state-of-the-art LLMs demonstrate that ESF reliably detects tampering, such as fine-tuning, model compression, and backdoor injection, with a detection rate exceeding 99.2% using 5 fingerprint samples, thereby offering a robust solution for securing cloud-based AI systems.

2024

pdf bib
AMPO: Automatic Multi-Branched Prompt Optimization
Sheng Yang | Yurong Wu | Yan Gao | Zineng Zhou | Bin Benjamin Zhu | Xiaodi Sun | Jian-Guang Lou | Zhiming Ding | Anbang Hu | Yuan Fang | Yunsong Li | Junyan Chen | Linjun Yang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Prompt engineering is very important to enhance the performance of large language models (LLMs). When dealing with complex issues, prompt engineers tend to distill multiple patterns from examples and inject relevant solutions to optimize the prompts, achieving satisfying results. However, existing automatic prompt optimization techniques are only limited to producing single flow instructions, struggling with handling diverse patterns. In this paper, we present AMPO, an automatic prompt optimization method that can iteratively develop a multi-branched prompt using failure cases as feedback. Our goal is to explore a novel way of structuring prompts with multi-branches to better handle multiple patterns in complex tasks, for which we introduce three modules: Pattern Recognition, Branch Adjustment, and Branch Pruning. In experiments across five tasks, AMPO consistently achieves the best results. Additionally, our approach demonstrates significant optimization efficiency due to our adoption of a minimal search strategy.

pdf bib
StraGo: Harnessing Strategic Guidance for Prompt Optimization
Yurong Wu | Yan Gao | Bin Benjamin Zhu | Zineng Zhou | Xiaodi Sun | Sheng Yang | Jian-Guang Lou | Zhiming Ding | Linjun Yang
Findings of the Association for Computational Linguistics: EMNLP 2024

Prompt engineering is pivotal for harnessing the capabilities of large language models (LLMs) across diverse applications. While existing prompt optimization methods improve prompt effectiveness, they often lead to prompt drifting, wherein newly generated prompts canadversely impact previously successful cases while addressing failures. Furthermore, these methods tend to rely heavily on LLMs’ intrinsic capabilities for prompt optimization tasks. In this paper, we introduce STRAGO (StrategicGuided Optimization), a novel approach designed to mitigate prompt drifting by leveraging insights from both successful and failed cases to identify critical factors for achieving optimization objectives. STRAGO employs a how-to-do methodology, integrating in-context learning to formulate specific, actionable strategies that provide detailed, step-by-step guidance for prompt optimization. Extensive experiments conducted across a range of tasks, including reasoning, natural language understanding, domain-specific knowledge, and industrial applications, demonstrate STRAGO’s superior performance. It establishes a new stateof-the-art in prompt optimization, showcasing its ability to deliver stable and effective prompt improvements.