Lifting Optimized Binaries to Canonical Compiler IR via Structure-Aware Retrieval and Iterative Verification
Xiaoao Zhu, Jie Ren, Zhiqiang Li, Jie Zheng, Zhanyong Tang, Zheng Wang
Abstract
Lifting stripped and highly optimized binaries to the canonical compiler intermediate representation (IR) enables program analysis when source code is unavailable. However, compiler optimizations severely distort control-flow and data-flow structure, making existing rule-based and LLM-based decompilation approaches brittle. We present BRIDGE, a system that reliably lifts optimized binaries to analysis-friendly compiler IR. BRIDGE combines control-flow-aware retrieval-augmented generation with feedback-driven verification. It uses pseudo-probe instrumentation to align optimized binary fragments with normalized IR semantics, and then employs an iterative refinement loop guided by static analysis and runtime feedback to improve executability and semantic consistency. We evaluate BRIDGE on HumanEval-Decompile and MBPP, lifting x86-64 and ARM64 binaries to LLVM IR. BRIDGE outperforms seven baselines, achieving an average of over 30% higher re-executability than the strongest general-purpose LLM baseline.
- Anthology ID:
- 2026.acl-long.527
- Volume:
- Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11498–11516
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.527/
- Cite (ACL):
- Xiaoao Zhu, Jie Ren, Zhiqiang Li, Jie Zheng, Zhanyong Tang, and Zheng Wang. 2026. Lifting Optimized Binaries to Canonical Compiler IR via Structure-Aware Retrieval and Iterative Verification. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 11498–11516, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- Lifting Optimized Binaries to Canonical Compiler IR via Structure-Aware Retrieval and Iterative Verification (Zhu et al., ACL 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.acl-long.527.pdf