Rethinking Reasoning: A Survey on Reasoning-based Backdoors in LLMs

Man Hu, Xinyi Wu, Zhufeng Suo, Jinbo Feng, Linghui Meng, Yanhao Jia, Anh Tuan Luu, Shuai Zhao


Abstract
With the rise of advanced reasoning capabilities, large language models (LLMs) are receiving increasing attention. While reasoning enhances LLMs’ performance on downstream tasks, it also introduces new threat vectors, as adversaries can leverage these capabilities to conduct backdoor attacks. Prior surveys provide broad overviews of backdoor attacks and reasoning security; however, a systematic survey of backdoor attacks against LLM reasoning, and of the corresponding defenses, is still absent. In this paper, we take a first step toward a comprehensive review of reasoning-based backdoor attacks in LLMs by analyzing their underlying mechanisms, methodological frameworks, and unresolved challenges. Specifically, we introduce a new taxonomy that offers a unified perspective for summarizing existing approaches, categorizing reasoning-based backdoor attacks as associative, passive, or active. We also summarize defenses against such attacks and discuss current challenges alongside future research directions.
Anthology ID:
2026.findings-acl.863
Volume:
Findings of the Association for Computational Linguistics: ACL 2026
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
17437–17456
URL:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.863/
Cite (ACL):
Man Hu, Xinyi Wu, Zhufeng Suo, Jinbo Feng, Linghui Meng, Yanhao Jia, Anh Tuan Luu, and Shuai Zhao. 2026. Rethinking Reasoning: A Survey on Reasoning-based Backdoors in LLMs. In Findings of the Association for Computational Linguistics: ACL 2026, pages 17437–17456, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Rethinking Reasoning: A Survey on Reasoning-based Backdoors in LLMs (Hu et al., Findings 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.findings-acl.863.pdf
Checklist:
 2026.findings-acl.863.checklist.pdf