Abstract
Text-to-SQL parsers are crucial in enabling non-experts to effortlessly query relational data. Training such parsers, by contrast, generally requires expertise in annotating natural language (NL) utterances with corresponding SQL queries. In this work, we propose a weak supervision approach for training text-to-SQL parsers. We take advantage of the recently proposed question meaning representation called QDMR, an intermediate between NL and formal query languages. Given questions, their QDMR structures (annotated by non-experts or automatically predicted), and the answers, we are able to automatically synthesize SQL queries that are used to train text-to-SQL models. We test our approach by experimenting on five benchmark datasets. Our results show that the weakly supervised models perform competitively with those trained on annotated NL-SQL data. Overall, we effectively train text-to-SQL parsers, while using zero SQL annotations.
- Anthology ID:
- 2022.findings-naacl.193
- Volume:
- Findings of the Association for Computational Linguistics: NAACL 2022
- Month:
- July
- Year:
- 2022
- Address:
- Seattle, United States
- Editors:
- Marine Carpuat, Marie-Catherine de Marneffe, Ivan Vladimir Meza Ruiz
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2528–2542
- URL:
- https://aclanthology.org/2022.findings-naacl.193
- DOI:
- 10.18653/v1/2022.findings-naacl.193
- Cite (ACL):
- Tomer Wolfson, Daniel Deutch, and Jonathan Berant. 2022. Weakly Supervised Text-to-SQL Parsing through Question Decomposition. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 2528–2542, Seattle, United States. Association for Computational Linguistics.
- Cite (Informal):
- Weakly Supervised Text-to-SQL Parsing through Question Decomposition (Wolfson et al., Findings 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2022.findings-naacl.193.pdf
- Code:
- tomerwolgithub/question-decomposition-to-sql