MultiCodeAttack: Iterative Jailbreak Attacking on LLMs with Multi-Code Prompt Injection

Weifeng Sun, Meng Yan, Zhou Yang, Yuchen Chen, Song Sun, David Lo


Abstract
Large Language Models (LLMs) demonstrate strong generalization capabilities but remain vulnerable to jailbreak attacks that induce restricted text or malicious code generation. Recent structured jailbreaks embed adversarial intent into code-like templates and have demonstrated promising effectiveness. However, existing approaches typically operate within a fixed template design and a single programming language, without considering language diversity or adaptive template evolution, thereby limiting the exploration of cross-language jailbreak behaviors. In this paper, we present MultiCodeAttack, a structured jailbreak framework that systematically explores and optimizes multi-language code templates. MultiCodeAttack maintains a diverse template library across programming languages, dynamically selects languages with higher attack effectiveness via a multi-armed bandit strategy, and evolves templates through semantic-preserving mutation guided by response-aware signals. Extensive experiments on 8 LLMs show that MultiCodeAttack outperforms existing jailbreak baselines, achieving 28.23%–832.59% higher harmful text generation. On malicious code generation across 11 LLMs, MultiCodeAttack produces up to 136.22% more malicious outputs than the baseline methods. Our code is available at https://anonymous.4open.science/r/MultiCodeAttack/.
Anthology ID:
2026.findings-acl.721
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
14670–14690
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.721/
Cite (ACL):
Weifeng Sun, Meng Yan, Zhou Yang, Yuchen Chen, Song Sun, and David Lo. 2026. MultiCodeAttack: Iterative Jailbreak Attacking on LLMs with Multi-Code Prompt Injection. In Findings of the Association for Computational Linguistics: ACL 2026, pages 14670–14690, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
MultiCodeAttack: Iterative Jailbreak Attacking on LLMs with Multi-Code Prompt Injection (Sun et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.721.pdf
Checklist:
 2026.findings-acl.721.checklist.pdf