Yaxun Dai


2025

PARSQL: Enhancing Text-to-SQL through SQL Parsing and Reasoning
Yaxun Dai | Haiqin Yang | Mou Hao | Pingfu Chao
Findings of the Association for Computational Linguistics: ACL 2025

Large language models (LLMs) have made significant strides in text-to-SQL tasks; however, small language models (SLMs) remain crucial for real-world deployment because of their low resource consumption and efficient inference. Due to resource limitations, SLMs struggle to accurately interpret natural language questions and may overlook critical constraints, which leads to SQL with incorrect logic or incomplete conditions. To address these issues, we propose PARSQL, a novel framework that leverages SQL parsing and reasoning. Specifically, we design PARSer, an SQL parser that extracts constraints from SQL to generate sub-SQLs for data augmentation and to produce step-by-step SQL explanations (reasons) via both rule-based and LLM-based methods. We define a novel text-to-reason task and incorporate it into multi-task learning, thereby enhancing text-to-SQL performance. Additionally, we employ an efficient SQL selection strategy that directly computes the similarity between the generated SQLs and their corresponding reasons to derive the final SQL for post-correction. Extensive experiments show that PARSQL outperforms models of the same size on the BIRD and Spider benchmarks. Notably, PARSQL-3B achieves 56.98% execution accuracy on BIRD, rivaling 7B models with significantly fewer parameters and setting a new state of the art. Code can be found [here](https://github.com/yaxundai/parsql).