Bin Chen
2026
When Efficiency Meets Safety: A Benchmark Security Analysis of KV Cache Compression in Large Language Models
Xiaoxiao Ma | Kuofeng Gao | Zeyi Lu | Wenxi Jiang | Hao Fang | Hao Wu | Bin Chen | Shu-Tao Xia
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Key-Value (KV) caching is widely used in large language models (LLMs) to enable efficient long-context inference, yet its security implications remain underexplored. We present the first systematic study of how KV cache compression interacts with jailbreak attacks, evaluating four model families under diverse jailbreak attacks. We identify a double-edged effect. (i) On one hand, compression can induce **Accidental Robustness**: optimization-based and encoding-based attacks fail because of Malicious Semantic Eviction, in which the attacks’ own attention redirection reduces the malicious query’s cache importance, and Gradient Mismatch, in which discrete compression operations break jailbreak optimization. (ii) On the other hand, a **Vulnerability Paradox** arises under merging-based compression for human-designed attacks, where aggressive merging in shallow layers triggers functional head collapse, amplifying attack success rates (ASR). To address this, we propose **Safe-CAM**, a history-aware, per-head feedback merging strategy that prevents safety degradation while maintaining efficiency. Experiments show Safe-CAM fully restores safety (0% ASR) and improves benign task performance with minimal overhead. Our study highlights that KV cache compression is not only an efficiency mechanism but also a safety-critical design factor in LLM deployment.
Retrievals Can Be Detrimental: Unveiling the Backdoor Vulnerability of Retrieval-Augmented Diffusion Models
Hao Fang | Xiaohang Sui | Hongyao Yu | Kuofeng Gao | Jiawei Kong | Sijin Yu | Bin Chen | Shu-Tao Xia
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Diffusion models (DMs) have recently exhibited impressive generation capability. However, their training generally requires huge computational resources and large-scale datasets. To address these costs, recent studies empower DMs with Retrieval-Augmented Generation (RAG), yielding retrieval-augmented diffusion models (RDMs) that enhance performance with reduced parameters. Despite this success, RAG may introduce novel security issues that warrant further investigation. In this paper, we propose BadRDM, the first poisoning framework targeting RDMs, to systematically investigate their vulnerability to backdoor attacks. Our framework fully considers RAG’s characteristics by manipulating the items retrieved for specific text triggers to ultimately control the generated outputs. Specifically, we first insert a tiny portion of images into the retrieval database as target toxicity surrogates. We then exploit the contrastive learning mechanism underlying retrieval models by designing a malicious variant that establishes robust shortcuts from triggers to toxicity surrogates. In addition, we introduce novel entropy-based selection and generative augmentation strategies to obtain better toxicity surrogates. Extensive experiments on two mainstream tasks show that the proposed method achieves outstanding attack effects while preserving benign utility. Notably, BadRDM remains effective even under common defense strategies, further highlighting serious security concerns for RDMs.
C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts
Chenxi Qing | Junxi Wu | Zheng Liu | Yixiang Qiu | Hongyao Yu | Bin Chen | Hao Wu | Shu-Tao Xia
Findings of the Association for Computational Linguistics: ACL 2026
Large language models (LLMs) can now generate highly fluent text. While they offer significant convenience to humans, they also introduce various risks, such as phishing and academic dishonesty. Numerous research efforts have been dedicated to developing algorithms for detecting AI-generated text and constructing relevant datasets. However, in the domain of Chinese corpora, challenges remain, including limited model diversity and data homogeneity. To address these issues, we propose C-ReD: a comprehensive Chinese Real-prompt AI-generated text Detection benchmark. Experiments demonstrate that C-ReD not only enables reliable in-domain detection but also supports strong generalization to unseen LLMs and external Chinese datasets, addressing critical gaps in model diversity, domain coverage, and prompt realism that have limited prior Chinese detection benchmarks. We release our resources at https://github.com/HeraldofLight/C-ReD.
From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents
Niu Lian | Yuting Wang | Hanshu Yao | Jinpeng Wang | Bin Chen | Yaowei Wang | Min Zhang | Shu-Tao Xia
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While multimodal large language models have demonstrated impressive short-term reasoning, they struggle with long-horizon video understanding due to limited context windows and static memory mechanisms that fail to mirror human cognitive efficiency. Existing paradigms typically fall into two extremes: vision-centric methods that incur high latency and redundancy through dense visual accumulation, or text-centric approaches that suffer from detail loss and hallucination via aggressive captioning. To bridge this gap, we propose **MM-Mem**, a pyramidal multimodal memory architecture grounded in *Fuzzy-Trace Theory*. **MM-Mem** structures memory hierarchically into a *Sensory Buffer*, *Episodic Stream*, and *Symbolic Schema*, enabling the progressive distillation of fine-grained perceptual traces (*verbatim*) into high-level semantic schemas (*gist*). Furthermore, to govern the dynamic construction of memory, we derive a Semantic Information Bottleneck objective and introduce SIB-GRPO to optimize the trade-off between memory compression and task-relevant information retention. In inference, we design an entropy-driven top-down memory retrieval strategy. Extensive experiments across 4 benchmarks confirm that **MM-Mem** achieves state-of-the-art performance on both offline and streaming tasks, demonstrating robust generalization and validating the effectiveness of cognition-inspired memory organization. Code and associated configurations are publicly available at https://github.com/EliSpectre/MM-Mem.
Infinite Babble: Inflating 3D Vision-Language Model Inference Overhead via Adversarial Geometric Perturbation
Shuoyang Sun | Jiaxin Hong | Yv Zhang | Kuofeng Gao | Hao Fang | Fan Mo | Bin Chen | Shu-Tao Xia
Findings of the Association for Computational Linguistics: ACL 2026
3D Vision-Language Models (3D-VLMs) have emerged as the critical cognitive backbone for spatial intelligence, enabling precise reasoning over unstructured 3D data. While these models serve as the foundation for downstream robotics and embodied systems, their reliance on autoregressive decoding introduces a fundamental vulnerability regarding inference efficiency. In this work, we present Inflate3D, a novel adversarial framework designed to trigger computational and economic exhaustion in 3D-VLMs. Specifically, we exploit the model’s sensitivity to untrusted 3D assets to hijack the generation process. Inflate3D operates by injecting imperceptible noise that forces the model into a state of pathological verbosity, effectively stalling the inference pipeline. Our approach comprises two synergistic strategies: (1) a semantic-aware adversarial manipulation that leverages internal representations to selectively perturb semantically critical regions while preserving geometric structure, and (2) a trajectory disruption mechanism that manipulates token probabilities to suppress End-of-Sequence (EOS) emission, thereby prolonging decoding and inducing verbose outputs. Experiments on standard benchmarks show that Inflate3D amplifies output length and energy consumption by up to 6.45×, demonstrating a potent capability to drain system resources. These findings expose a critical blind spot in multimodal alignment, highlighting the urgent need to secure spatial foundation models against resource exhaustion attacks.
HiPrune: Hierarchical Attention for Efficient Token Pruning in Vision-Language Models
Jizhihui Liu | Guangdao Zhu | Feiyi Du | Niu Lian | Jun Li | Bin Chen | Weili Guan | Yaowei Wang
Findings of the Association for Computational Linguistics: ACL 2026
Vision-Language Models (VLMs) encode images and videos into abundant tokens, which carry substantial redundancy and incur high computation cost. While visual token pruning mitigates the issue, most existing methods lack insight into the intrinsic properties of the vision encoder itself. In this work, we examine the vision encoder and show, qualitatively and quantitatively, that the middle layers pay more attention to the main objects of the image, while the deep layers attend to tokens with rich global information. Utilizing this Hierarchical attention pattern, we propose HiPrune, a training-free and model-agnostic token Pruning method. HiPrune identifies three types of visual tokens according to their attention in different phases of the vision encoder, thereby preserving different levels of information. By coupling with the similarity of text tokens, we propose a prompt-aware variant, HiPrune++, which further improves instruction-following performance under a very low token budget. Extensive experiments across four representative VLMs show that HiPrune achieves up to 99.3% of task accuracy with only 1/3 of the tokens, while reducing inference FLOPs by 58.7%. HiPrune++ maintains up to 99.9% accuracy with 2/9 of the tokens, highlighting robustness under high resolution.
2025
Your Language Model Can Secretly Write Like Humans: Contrastive Paraphrase Attacks on LLM-Generated Text Detectors
Hao Fang | Jiawei Kong | Tianqu Zhuang | Yixiang Qiu | Kuofeng Gao | Bin Chen | Shu-Tao Xia | Yaowei Wang | Min Zhang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The misuse of large language models (LLMs), such as academic plagiarism, has driven the development of detectors to identify LLM-generated texts. To bypass these detectors, paraphrase attacks have emerged that purposely rewrite these texts to evade detection. Despite their success, existing methods require substantial data and computational budgets to train a specialized paraphraser, and their attack efficacy drops sharply when faced with advanced detection algorithms. To address this, we propose Contrastive Paraphrase Attack (CoPA), a training-free method that effectively deceives text detectors using off-the-shelf LLMs. The first step is to carefully craft instructions that encourage LLMs to produce more human-like texts. Nonetheless, we observe that the inherent statistical biases of LLMs can still result in some generated texts carrying certain machine-like attributes that can be captured by detectors. To overcome this, CoPA constructs an auxiliary machine-like word distribution as a contrast to the human-like distribution generated by the LLM. By subtracting the machine-like patterns from the human-like distribution during the decoding process, CoPA is able to produce sentences that are less discernible to text detectors. Our theoretical analysis suggests the superiority of the proposed attack. Extensive experiments validate the effectiveness of CoPA in fooling text detectors across various scenarios.
MoSEs: Uncertainty-Aware AI-Generated Text Detection via Mixture of Stylistics Experts with Conditional Thresholds
Junxi Wu | Jinpeng Wang | Zheng Liu | Bin Chen | Dongjian Hu | Hao Wu | Shu-Tao Xia
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
The rapid advancement of large language models has intensified public concerns about their potential misuse. Therefore, it is important to build trustworthy AI-generated text detection systems. Existing methods neglect stylistic modeling and mostly rely on static thresholds, which greatly limits detection performance. In this paper, we propose the Mixture of Stylistic Experts (MoSEs) framework, which enables stylistics-aware uncertainty quantification through conditional threshold estimation. MoSEs contains three core components, namely, the Stylistics Reference Repository (SRR), the Stylistics-Aware Router (SAR), and the Conditional Threshold Estimator (CTE). For input text, SAR can activate the appropriate reference data in SRR and provide them to CTE. Subsequently, CTE jointly models the linguistic statistical properties and semantic features to dynamically determine the optimal threshold. With a discrimination score, MoSEs yields prediction labels with the corresponding confidence level. Our framework achieves an average improvement of 11.34% in detection performance compared to baselines. Notably, MoSEs shows a larger improvement of 39.15% in the low-resource case. Our code is available at https://github.com/creator-xi/MoSEs.
Modeling Uncertainty in Composed Image Retrieval via Probabilistic Embeddings
Haomiao Tang | Jinpeng Wang | Yuang Peng | GuangHao Meng | Ruisheng Luo | Bin Chen | Long Chen | Yaowei Wang | Shu-Tao Xia
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Composed Image Retrieval (CIR) enables users to search for images using multimodal queries that combine text and reference images. While metric learning methods have shown promise, they rely on deterministic point embeddings that fail to capture the inherent uncertainty in the input data, in which user intentions may be imprecisely specified or open to multiple interpretations. We address this challenge by reformulating CIR through our proposed Composed Probabilistic Embedding (CoPE) framework, which represents both queries and targets as Gaussian distributions in latent space rather than fixed points. Through careful design of probabilistic distance metrics and hierarchical learning objectives, CoPE explicitly captures uncertainty at both instance and feature levels, enabling more flexible, nuanced, and robust matching that can handle polysemy and ambiguity in search intentions. Extensive experiments across multiple benchmarks demonstrate that CoPE effectively quantifies both quality and semantic uncertainties within Composed Image Retrieval, achieving state-of-the-art performance on recall rate. Code: https://github.com/tanghme0w/ACL25-CoPE.
Co-authors
- Shu-Tao Xia 8
- Hao Fang 4
- Kuofeng Gao 4
- Yaowei Wang 4
- Hao Wu 3
- Jiawei Kong 2
- Niu Lian 2
- Yixiang Qiu 2
- Jinpeng Wang 2
- Junxi Wu 2
- Hongyao Yu 2
- Min Zhang 2
- Long Chen 1
- Feiyi Du 1
- Weili Guan 1
- Jiaxin Hong 1
- Dongjian Hu 1
- Wenxi Jiang 1
- Jun Li (李俊) 1
- Zheng Liu 1
- Jizhihui Liu 1
- Zheng Liu 1
- Zeyi Lu 1
- Ruisheng Luo 1
- Xiaoxiao Ma 1
- GuangHao Meng 1
- Fan Mo 1
- Yuang Peng 1
- Chenxi Qing 1
- Xiaohang Sui 1
- Shuoyang Sun 1
- Haomiao Tang 1
- Yuting Wang 1
- Jinpeng Wang 1
- Hanshu Yao 1
- Sijin Yu 1
- Yv Zhang 1
- Guangdao Zhu 1
- Tianqu Zhuang 1