Chain of Attack: Hide Your Intention through Multi-Turn Interrogation
Xikang Yang, Biyu Zhou, Xuehai Tang, Jizhong Han, Songlin Hu
Abstract
The latent knowledge of large language models (LLMs) contains harmful or unethical content, which introduces significant security risks upon their widespread deployment. Conducting jailbreak attacks on LLMs can proactively identify vulnerabilities and thereby enhance their security measures. However, previous jailbreak attacks primarily focus on single-turn dialogue scenarios, leaving vulnerabilities in multi-turn dialogue contexts inadequately explored. This paper investigates the resilience of black-box LLMs in multi-turn jailbreak attack scenarios from a novel interrogation perspective. We propose an optimal interrogation principle to conceal the jailbreak intent and introduce a multi-turn attack chain generation strategy called CoA. By employing two effective interrogation strategies tailored for LLMs, coupled with an interrogation history record management mechanism, it achieves a significant optimization of the attack process. Our approach enables the iterative generation of attack chains, offering a powerful tool for LLM red-team testing. Experimental results demonstrate that LLMs exhibit insufficient resistance under multi-turn interrogation, and that our method holds a clear advantage over prior attacks (ASR: 83% vs. 64%). This work offers new insights into improving the safety of LLMs.
- Anthology ID:
- 2025.findings-acl.514
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2025
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
- Venues:
- Findings | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9881–9901
- URL:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.514/
- Cite (ACL):
- Xikang Yang, Biyu Zhou, Xuehai Tang, Jizhong Han, and Songlin Hu. 2025. Chain of Attack: Hide Your Intention through Multi-Turn Interrogation. In Findings of the Association for Computational Linguistics: ACL 2025, pages 9881–9901, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Chain of Attack: Hide Your Intention through Multi-Turn Interrogation (Yang et al., Findings 2025)
- PDF:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.findings-acl.514.pdf