Xiaohu Zhu


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Learning SQL Like a Human: Structure-Aware Curriculum Learning for Text-to-SQL Generation
Xiaohu Zhu | Qian Li | Lizhen Cui | Yuntao Du
Findings of the Association for Computational Linguistics: EMNLP 2025

The Text-to-SQL capabilities of large language allow users to interact with databases using natural language. While current models struggle with handling complex queries, especially involving multi-table joins and reasoning. To address this gap, we propose to construct a model, namely SAC-SQL, with synthetic training samples followed by a structure-aware curriculum learning framework for enhancing SQL generation. Our approach begins with a supervised fine-tuning (SFT) stage, where we train open-source models on a synthetically constructed, cross-domain SQL dataset with diverse structural patterns. Moreover, we introduce a unified structure difficulty scoring function to partition the training samples into non-overlapping curriculum phases, guiding the model progressively learning from simpler to more complex SQL structures. Extensive experiments are conducted and the results show that SAC-SQL achieves better results than the baselines, and significantly narrows the performance gap between open-source and close-source models on Spider and Bird benchmarks.