Abstract
Translating natural language utterances into executable queries helps make the vast amount of data stored in relational databases accessible to a wider range of non-technical end users. Prior work in this area has largely focused on textual input that is linguistically correct and semantically unambiguous. However, real-world user queries are often succinct, colloquial, and noisy, resembling the input to a search engine. In this work, we introduce data augmentation techniques and a sampling-based content-aware BERT model (ColloQL) to achieve robust text-to-SQL modeling over natural language search (NLS) questions. Because evaluation data is lacking, we curate a new dataset of NLS questions and demonstrate the efficacy of our approach. ColloQL's superior performance extends to well-formed text, achieving 84.9% logical-form accuracy and 90.7% execution accuracy on the WikiSQL dataset, making it, to the best of our knowledge, the highest-performing model that does not use execution-guided decoding.
- Anthology ID:
- 2020.intexsempar-1.5
- Volume:
- Proceedings of the First Workshop on Interactive and Executable Semantic Parsing
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Ben Bogin, Srinivasan Iyer, Victoria Lin, Dragomir Radev, Alane Suhr, Panupong, Caiming Xiong, Pengcheng Yin, Tao Yu, Rui Zhang, Victor Zhong
- Venue:
- intexsempar
- Publisher:
- Association for Computational Linguistics
- Pages:
- 34–45
- URL:
- https://aclanthology.org/2020.intexsempar-1.5
- DOI:
- 10.18653/v1/2020.intexsempar-1.5
- Cite (ACL):
- Karthik Radhakrishnan, Arvind Srikantan, and Xi Victoria Lin. 2020. ColloQL: Robust Text-to-SQL Over Search Queries. In Proceedings of the First Workshop on Interactive and Executable Semantic Parsing, pages 34–45, Online. Association for Computational Linguistics.
- Cite (Informal):
- ColloQL: Robust Text-to-SQL Over Search Queries (Radhakrishnan et al., intexsempar 2020)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2020.intexsempar-1.5.pdf
- Code:
- karthikradhakrishnan96/ColloQL
- Data:
- WikiSQL