Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs

Marina Sakharova, Abhinav Anand, Mira Mezini


Abstract
Code-generating Large Language Models (LLMs) have become essential tools in modern software development, enhancing productivity and accelerating the development process. This paper investigates fine-tuning code-generating LLMs with Reinforcement Learning and Direct Preference Optimization (DPO) to further improve their performance. To this end, we enhance the training data for the reward model using symbolic execution techniques, yielding more comprehensive and objective data. With symbolic execution, we create a custom dataset that better captures the nuances of code evaluation. Our reward models, fine-tuned on this dataset, significantly outperform the baseline, CodeRL, at estimating the quality of generated code. Our code-generating LLMs, trained with reward-model feedback, achieve results comparable to the CodeRL benchmark.
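The abstract's core idea, using symbolic execution to derive more objective training signals for a reward model, can be illustrated with a minimal sketch. The sketch below is not the paper's implementation: the reference function, the hard-coded path-covering inputs (which a real symbolic executor would derive from branch conditions), and the scoring/pairing helpers are all illustrative assumptions.

```python
# Hedged sketch: score candidate programs on inputs that cover every
# execution path of a reference solution, then order them into
# (chosen, rejected) preference pairs as reward-model training data.
# All names here are illustrative, not taken from the paper.

def reference(x):
    """Reference solution: absolute value, with one branch on x >= 0."""
    return x if x >= 0 else -x

# Inputs a symbolic executor would derive to cover both branches of
# `x >= 0` (boundary plus points on each path); hard-coded for the sketch.
path_covering_inputs = [-5, -1, 0, 1, 5]

candidate_a = lambda x: abs(x)  # correct candidate
candidate_b = lambda x: x       # buggy candidate (wrong on negatives)

def score(candidate):
    """Fraction of path-covering inputs where the candidate matches the reference."""
    hits = sum(candidate(i) == reference(i) for i in path_covering_inputs)
    return hits / len(path_covering_inputs)

def preference_pair(cand1, cand2):
    """Order two candidates into (chosen, rejected) by their path-coverage score."""
    return (cand1, cand2) if score(cand1) >= score(cand2) else (cand2, cand1)

chosen, rejected = preference_pair(candidate_a, candidate_b)
```

Because the inputs cover every path, a candidate that only passes "easy" cases is penalized, which is the kind of objectivity the abstract attributes to symbolic-execution-derived data; the resulting pairs are the standard input format for DPO-style training.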
Anthology ID:
2025.naacl-srw.27
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
Month:
April
Year:
2025
Address:
Albuquerque, USA
Editors:
Abteen Ebrahimi, Samar Haider, Emmy Liu, Sammar Haider, Maria Leonor Pacheco, Shira Wein
Venues:
NAACL | WS
Publisher:
Association for Computational Linguistics
Pages:
271–278
URL:
https://preview.aclanthology.org/fix-sig-urls/2025.naacl-srw.27/
Cite (ACL):
Marina Sakharova, Abhinav Anand, and Mira Mezini. 2025. Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop), pages 271–278, Albuquerque, USA. Association for Computational Linguistics.
Cite (Informal):
Integrating Symbolic Execution into the Fine-Tuning of Code-Generating LLMs (Sakharova et al., NAACL 2025)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2025.naacl-srw.27.pdf