Towards Generalizable and Robust Text-to-SQL Parsing
Chang Gao, Bowen Li, Wenxuan Zhang, Wai Lam, Binhua Li, Fei Huang, Luo Si, Yongbin Li
Abstract
Text-to-SQL parsing tackles the problem of mapping natural language questions to executable SQL queries. In practice, text-to-SQL parsers often encounter various challenging scenarios, requiring them to be generalizable and robust. While most existing work addresses a particular generalization or robustness challenge, we aim to study it in a more comprehensive manner. In specific, we believe that text-to-SQL parsers should be (1) generalizable at three levels of generalization, namely i.i.d., zero-shot, and compositional, and (2) robust against input perturbations. To enhance these capabilities of the parser, we propose a novel TKK framework consisting of Task decomposition, Knowledge acquisition, and Knowledge composition to learn text-to-SQL parsing in stages. By dividing the learning process into multiple stages, our framework improves the parser’s ability to acquire general SQL knowledge instead of capturing spurious patterns, making it more generalizable and robust. Experimental results under various generalization and robustness settings show that our framework is effective in all scenarios and achieves state-of-the-art performance on the Spider, SParC, and CoSQL datasets.- Anthology ID:
- 2022.findings-emnlp.155
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2022
- Month:
- December
- Year:
- 2022
- Address:
- Abu Dhabi, United Arab Emirates
- Editors:
- Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue:
- Findings
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 2113–2125
- Language:
- URL:
- https://aclanthology.org/2022.findings-emnlp.155
- DOI:
- 10.18653/v1/2022.findings-emnlp.155
- Cite (ACL):
- Chang Gao, Bowen Li, Wenxuan Zhang, Wai Lam, Binhua Li, Fei Huang, Luo Si, and Yongbin Li. 2022. Towards Generalizable and Robust Text-to-SQL Parsing. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 2113–2125, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal):
- Towards Generalizable and Robust Text-to-SQL Parsing (Gao et al., Findings 2022)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2022.findings-emnlp.155.pdf