Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression

Jingyu Peng, Maolin Wang, Nan Wang, Jiatong Li, Yuchen Li, Yuyang Ye, Wanyu Wang, Pengyue Jia, Kai Zhang, Xiangyu Zhao


Abstract
Despite substantial advances in aligning LLMs with human values, current safety mechanisms remain susceptible to jailbreak attacks. We attribute this vulnerability to distributional discrepancies between alignment-oriented prompts and malicious prompts. To investigate this, and drawing inspiration from logic-driven NLP tasks, we introduce LogiBreak, a universal black-box jailbreak method that uses logical expression translation to bypass LLM safety mechanisms. By converting harmful natural language prompts into formal logical expressions, LogiBreak exploits the distributional gap between alignment data and logic-expressed inputs, preserving the underlying semantic intent and readability while evading safety constraints. Furthermore, because existing benchmarks lack systematic resources targeting logical-expression-based attacks on LLM robustness, we construct a novel multilingual logical expression jailbreak dataset for evaluation. Our evaluations of LogiBreak in five languages demonstrate its effectiveness and generalizability across linguistic contexts. The code is available at https://github.com/Applied-Machine-Learning-Lab/ACL2026_Logibreak.
Anthology ID:
2026.findings-acl.25
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
523–543
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.25/
Cite (ACL):
Jingyu Peng, Maolin Wang, Nan Wang, Jiatong Li, Yuchen Li, Yuyang Ye, Wanyu Wang, Pengyue Jia, Kai Zhang, and Xiangyu Zhao. 2026. Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression. In Findings of the Association for Computational Linguistics: ACL 2026, pages 523–543, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Logic Jailbreak: Efficiently Unlocking LLM Safety Restrictions Through Formal Logical Expression (Peng et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.25.pdf
Checklist:
 2026.findings-acl.25.checklist.pdf