Sheng Guan
2026
SQL-Trail: Multi-Turn Reinforcement Learning with Interleaved Feedback for Text-to-SQL
Harper Hua | Zhen Han | Zhengyuan Shen | Meng-Chieh Lee | Sheng Guan | Qi Zhu | Sullam Jeoung | Yueyan Chen | Yunfei Bai | Shuai Wang | Vassilis N. Ioannidis | Huzefa Rangwala
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While large language models (LLMs) have substantially improved Text-to-SQL generation, a pronounced gap remains between AI systems and human experts on challenging benchmarks such as BIRD-SQL. We argue this gap stems largely from the prevailing single-pass paradigm, which lacks the iterative reasoning, schema exploration, and error-correction behaviors that humans naturally employ. To address this limitation, we introduce SQL-Trail, a multi-turn reinforcement learning (RL) agentic framework for Text-to-SQL. Rather than producing a query in one shot, SQL-Trail interacts with the database environment and uses execution feedback to iteratively refine its predictions. Our approach centers on two key ideas: (i) an adaptive turn-budget allocation mechanism that scales the agent’s interaction depth to match question difficulty, and (ii) a composite reward panel that jointly incentivizes SQL correctness and efficient exploration. Across benchmarks, SQL-Trail sets a new state of the art and delivers strong data efficiency—up to **18×** higher than prior single-pass RL state-of-the-art methods. Notably, our 7B and 14B models outperform substantially larger proprietary systems by **5%** on average, underscoring the effectiveness of interactive, agentic workflows for robust Text-to-SQL generation.
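The interaction pattern the abstract describes (propose a query, execute it, feed the result or error back, try again within a turn budget) can be illustrated with a minimal sketch. This is not the SQL-Trail implementation: `run_episode`, `toy_reward`, the `propose` callback standing in for the policy model, the fixed `max_turns` budget, and the `turn_penalty` value are all hypothetical, and the reward is only a toy combination of correctness and turn count, not the paper's composite reward. Only Python's standard `sqlite3` module is used.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# One past turn: the SQL that was tried and the feedback the database returned.
Turn = Tuple[str, str]


def run_episode(
    conn: sqlite3.Connection,
    propose: Callable[[List[Turn]], str],  # hypothetical stand-in for the policy model
    max_turns: int,                        # assumed fixed budget; the paper adapts it per question
) -> Tuple[Optional[list], int, List[Turn]]:
    """Let the policy retry after execution errors, up to max_turns attempts."""
    history: List[Turn] = []
    for turn in range(1, max_turns + 1):
        sql = propose(history)
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Execution feedback: the error text is shown to the policy next turn.
            history.append((sql, f"error: {exc}"))
            continue
        return rows, turn, history
    return None, max_turns, history


def toy_reward(rows: Optional[list], gold: list, turns_used: int,
               turn_penalty: float = 0.05) -> float:
    """Toy reward (an assumption): 1 for matching rows, minus a cost per extra turn."""
    correct = 1.0 if rows is not None and sorted(rows) == sorted(gold) else 0.0
    return correct - turn_penalty * (turns_used - 1)


def make_scripted_policy(attempts: List[str]) -> Callable[[List[Turn]], str]:
    """Replay fixed attempts in order; a real policy would condition on the feedback."""
    return lambda history: attempts[min(len(history), len(attempts) - 1)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singers (name TEXT, age INTEGER)")
    conn.executemany("INSERT INTO singers VALUES (?, ?)", [("Ana", 41), ("Bo", 23)])

    policy = make_scripted_policy([
        "SELECT nme FROM singers WHERE age > 30",   # misspelled column, fails
        "SELECT name FROM singers WHERE age > 30",  # corrected on the second turn
    ])
    rows, turns, history = run_episode(conn, policy, max_turns=3)
    print(rows, turns, history)
    print(toy_reward(rows, gold=[("Ana",)], turns_used=turns))
```

In this sketch the per-turn penalty is what rewards efficient exploration: an episode that reaches the correct rows in fewer turns scores higher than one that needs several corrections.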