When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?

Yibo Peng, James Song, Lei Li, Xinyu Yang, Mihai Christodorescu, Ravi Mangal, Corina S. Pasareanu, Haizhong Zheng, Beidi Chen


Abstract
Code agents are increasingly trusted to autonomously fix bugs on platforms such as GitHub, yet their security evaluation focuses almost exclusively on functional correctness. In this paper, we reveal a novel type of threat to real-world code agents: functionally correct yet vulnerable (FCV) patches, which pass all test cases but contain vulnerable code. With our proposed FCV-Attack, we demonstrate that SOTA LLMs (e.g., ChatGPT and Claude) and agent scaffolds (e.g., SWE-agent and OpenHands) are all vulnerable to this FCV threat; across 12 agent-model combinations on SWE-Bench, the attack requires only black-box access and a single query to the code agent. For example, for CWE-538 (information exposure vulnerability), the FCV-Attack attains an attack success rate of 40.7% on GPT-5 Mini + OpenHands. Our results reveal an important security threat overlooked by current evaluation paradigms and urge the development of security-aware defenses for code agents.
Anthology ID:
2026.acl-long.707
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
15514–15546
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.707/
Cite (ACL):
Yibo Peng, James Song, Lei Li, Xinyu Yang, Mihai Christodorescu, Ravi Mangal, Corina S. Pasareanu, Haizhong Zheng, and Beidi Chen. 2026. When "Correct" Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents? In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15514–15546, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
When “Correct” Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents? (Peng et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.707.pdf
Checklist:
 2026.acl-long.707.checklist.pdf