Chengpeng Wang
2026
HintPilot: LLM-based Compiler Hint Synthesis for Code Optimization
Hanyun Jiang | Peisen Yao | Kaiyue Li | Tingting Lin | Chengpeng Wang | Kui Ren
Findings of the Association for Computational Linguistics: ACL 2026
Code optimization remains a core objective in software development, yet modern compilers struggle to navigate enormous optimization spaces. While recent research has explored using large language models (LLMs) to optimize source code directly, these techniques can introduce semantic errors and miss fine-grained compiler-level optimization opportunities. We present HintPilot, which bridges LLM-based reasoning with traditional compiler infrastructures by synthesizing compiler hints, annotations that steer compiler behavior. HintPilot employs retrieval-augmented synthesis over compiler documentation and applies profiling-guided iterative refinement to synthesize semantics-preserving and effective hints. On the PolyBench and HumanEval-CPP benchmarks, HintPilot achieves up to 6.88x geometric mean speedup while preserving program correctness.
Raw Pointer Rewriting with LLMs for Translating C to Safer Rust
Yifei Gao | Chengpeng Wang | Pengxiang Huang | Xuwei Liu | Mingwei Zheng | Xiangyu Zhang
Findings of the Association for Computational Linguistics: ACL 2026
There has been growing interest in translating C code to Rust due to Rust’s robust memory and thread safety guarantees. Tools such as C2Rust enable syntax-guided transpilation from C to semantically equivalent Rust code. However, the resulting Rust programs often rely heavily on unsafe constructs, particularly raw pointers, which undermines Rust’s safety guarantees. This paper aims to improve the memory safety of Rust programs generated by C2Rust by eliminating raw pointers. Specifically, we propose PR2, a raw pointer rewriting technique that lifts raw pointers in individual functions to appropriate Rust data structures. Technically, PR2 employs decision-tree-based prompting to guide the pointer lifting process. It also leverages code change analysis to guide the repair of errors introduced during rewriting, effectively addressing errors encountered during compilation and test case execution. We implement PR2 and evaluate it using gpt-4o-mini on 28 real-world C projects. PR2 eliminates 18.57% of local raw pointers across these projects, significantly enhancing the safety of the translated Rust code. On average, PR2 completes the transformation of a project in 5.02 hours, at a cost of $1.13. Our code is available at https://github.com/bhcsayx/PR2.
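As a rough illustration of what lifting a raw pointer to a Rust data structure can mean, the sketch below contrasts a pointer-plus-length function of the kind a C-to-Rust transpiler tends to emit with a hand-written slice-based equivalent. The function names `sum_raw` and `sum_lifted` are hypothetical and the rewrite was done by hand; it is not output of PR2 and is not taken from the paper.

```rust
/// Transpiler-style signature (hypothetical example): the buffer arrives as a
/// raw pointer plus a separate length, so the function must be `unsafe`.
/// The caller has to guarantee that `data` points to `len` valid `i32` values.
pub unsafe fn sum_raw(data: *const i32, len: usize) -> i32 {
    let mut total = 0;
    let mut i = 0;
    while i < len {
        // Dereferencing a raw pointer is only allowed in unsafe code.
        total += unsafe { *data.add(i) };
        i += 1;
    }
    total
}

/// Hand-lifted version (hypothetical example): the pointer and the length
/// collapse into a single slice, so bounds are tracked by the type and the
/// body needs no unsafe code.
pub fn sum_lifted(data: &[i32]) -> i32 {
    data.iter().sum()
}
```

Calling `sum_lifted(&values)` on a `Vec<i32>` needs no `unsafe` block at the call site, whereas `sum_raw(values.as_ptr(), values.len())` must be wrapped in one.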
2024
Sanitizing Large Language Models in Bug Detection with Data-Flow
Chengpeng Wang | Wuqi Zhang | Zian Su | Xiangzhe Xu | Xiangyu Zhang
Findings of the Association for Computational Linguistics: EMNLP 2024
Large language models (LLMs) show potential in code reasoning tasks, facilitating customized bug detection in software development. However, hallucination can significantly compromise the reliability of bug reports. This work formulates a new schema of bug detection and presents a novel sanitization technique that detects false positives to mitigate hallucination. Our key idea is to force LLMs to emit data-flow paths in few-shot chain-of-thought prompting and to validate them via program-property decomposition. Specifically, we dissect data-flow paths into basic properties over concise code snippets and leverage parsing-based analysis and LLMs for validation. On average, our approach achieves 91.03% precision and 74.00% recall on synthetic benchmarks, and the sanitization boosts precision by 21.99%. The evaluation on real-world Android malware applications also demonstrates its superiority over an industrial analyzer, surpassing it in precision and recall by 15.36% and 3.61%, respectively.