@inproceedings{zhai-etal-2025-optimizing,
    title = "Optimizing Reasoning for Text-to-{SQL} with Execution Feedback",
    author = "Zhai, Bohan  and
      Xu, Canwen  and
      He, Yuxiong  and
      Yao, Zhewei",
    editor = "Che, Wanxiang  and
      Nabende, Joyce  and
      Shutova, Ekaterina  and
      Pilehvar, Mohammad Taher",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2025",
    month = jul,
    year = "2025",
    address = "Vienna, Austria",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.findings-acl.982/",
    doi = "10.18653/v1/2025.findings-acl.982",
    pages = "19206--19218",
    ISBN = "979-8-89176-256-5",
    abstract = "Text-to-SQL demands precise reasoning to convert natural language questions into structured queries. While large language models (LLMs) excel in many reasoning tasks, their ability to leverage Chain-of-Thought (CoT) reasoning for text-to-SQL remains underexplored. We identify critical limitations: zero-shot CoT offers minimal gains, and Direct Preference Optimization (DPO) applied without CoT yields marginal improvements. We propose ExCoT-DPO, a novel framework that iteratively optimizes open-source LLMs by combining CoT reasoning with off-policy and on-policy DPO, relying solely on execution accuracy as feedback. This approach eliminates the need for reward models or human-annotated preferences. Our experimental results demonstrate significant performance gains: ExCoT-DPO improves execution accuracy on BIRD from 57.37{\%} to 68.51{\%} and on Spider from 78.81{\%} to 86.59{\%} for LLaMA-3 70B, with Qwen-2.5-Coder demonstrating similar improvements. Our best model achieves state-of-the-art performance in the single-model setting on both BIRD and Spider datasets."
}