MultiCodeAttack: Iterative Jailbreak Attacking on LLMs with Multi-Code Prompt Injection
Weifeng Sun, Meng Yan, Zhou Yang, Yuchen Chen, Song Sun, David Lo
Abstract
Large Language Models (LLMs) demonstrate strong generalization capabilities but remain vulnerable to jailbreak attacks that induce restricted text or malicious code generation. Recent structured jailbreaks embed adversarial intent into code-like templates and have demonstrated promising effectiveness. However, existing approaches typically operate within a fixed template design and a single programming language, without considering language diversity or adaptive template evolution, thereby limiting the exploration of cross-language jailbreak behaviors. In this paper, we present MultiCodeAttack, a structured jailbreak framework that systematically explores and optimizes multi-language code templates. MultiCodeAttack maintains a diverse template library across programming languages, dynamically selects languages with higher attack effectiveness via a multi-armed bandit strategy, and evolves templates through semantic-preserving mutation guided by response-aware signals. Extensive experiments on 8 LLMs show that MultiCodeAttack outperforms existing jailbreak baselines, achieving 28.23%–832.59% higher harmful text generation. On malicious code generation across 11 LLMs, MultiCodeAttack produces up to 136.22% more malicious outputs than the baseline methods. Our code is available at https://anonymous.4open.science/r/MultiCodeAttack/.
- Anthology ID:
- 2026.findings-acl.721
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 14670–14690
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.721/
- Cite (ACL):
- Weifeng Sun, Meng Yan, Zhou Yang, Yuchen Chen, Song Sun, and David Lo. 2026. MultiCodeAttack: Iterative Jailbreak Attacking on LLMs with Multi-Code Prompt Injection. In Findings of the Association for Computational Linguistics: ACL 2026, pages 14670–14690, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- MultiCodeAttack: Iterative Jailbreak Attacking on LLMs with Multi-Code Prompt Injection (Sun et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.721.pdf