@inproceedings{pourreza-rafiei-2024-dts,
    title = "{DTS}-{SQL}: Decomposed Text-to-{SQL} with Small Large Language Models",
    author = "Pourreza, Mohammadreza  and
      Rafiei, Davood",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.findings-emnlp.481/",
    doi = "10.18653/v1/2024.findings-emnlp.481",
    pages = "8212--8220",
    abstract = "Leading models for the text-to-SQL task heavily rely on proprietary Large Language Models (LLMs), posing concerns over data privacy. Closing the performance gap between small open-source models and large proprietary models is crucial to mitigate this reliance. To this end, we introduce a novel two-stage fine-tuning approach that decomposes the task into two simpler tasks. Through comprehensive evaluation on three large cross-domain datasets and two small LLMs, we show that this approach improves execution accuracy by 3 to 7 percent, effectively aligning the performance of open-source models with their proprietary counterparts. Our proposed method has achieved 60.31{\%} execution accuracy on Bird hold-out test set, which is the highest performance among methods using 7B parameter models."
}