Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing

Xi Victoria Lin, Richard Socher, Caiming Xiong


Abstract
We present BRIDGE, a powerful sequential architecture for modeling dependencies between natural language questions and relational databases in cross-DB semantic parsing. BRIDGE represents the question and DB schema in a tagged sequence where a subset of the fields are augmented with cell values mentioned in the question. The hybrid sequence is encoded by BERT with minimal subsequent layers and the text-DB contextualization is realized via the fine-tuned deep attention in BERT. Combined with a pointer-generator decoder with schema-consistency driven search space pruning, BRIDGE attained state-of-the-art performance on the well-studied Spider benchmark (65.5% dev, 59.2% test), despite being much simpler than most recently proposed models for this task. Our analysis shows that BRIDGE effectively captures the desired cross-modal dependencies and has the potential to generalize to more text-DB related tasks. Our model implementation is available at https://github.com/salesforce/TabularSemanticParsing.
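To make the "tagged sequence" idea concrete, here is a minimal sketch of how a question and a schema might be flattened into one hybrid string before being passed to BERT. The marker tokens `[T]`, `[C]`, and `[V]`, the `serialize` helper, the toy schema, and the use of case-insensitive substring matching to decide which cell values are "mentioned in the question" are all assumptions made for illustration; the abstract does not specify them, and the released implementation may differ.

```python
from typing import Dict, List

# Hypothetical schema layout: table name -> column name -> candidate cell values.
Schema = Dict[str, Dict[str, List[str]]]


def serialize(question: str, schema: Schema) -> str:
    """Flatten a question and a DB schema into one tagged sequence.

    Each table is prefixed with [T], each column with [C], and a column is
    augmented with [V] <value> only when that cell value appears in the
    question (simplified here to a case-insensitive substring check).
    """
    q_lower = question.lower()
    parts = ["[CLS]", question, "[SEP]"]
    for table, columns in schema.items():
        parts += ["[T]", table]
        for column, values in columns.items():
            parts += ["[C]", column]
            for value in values:
                if value.lower() in q_lower:
                    parts += ["[V]", value]
    parts.append("[SEP]")
    return " ".join(parts)


# Toy example (invented data, not taken from Spider or WikiSQL).
schema = {
    "singer": {
        "name": ["Joe Sharp", "Timbaland"],
        "country": ["France", "United States"],
    },
    "concert": {"year": ["2014", "2015"]},
}
hybrid = serialize("How many singers are from France?", schema)
print(hybrid)
```

Only the `country` field receives a `[V]` value in this example, because "France" is the only cell value that occurs in the question. This mirrors the abstract's statement that a subset of the fields is augmented, while the rest of the schema is listed without values.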
Anthology ID:
2020.findings-emnlp.438
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
4870–4888
URL:
https://aclanthology.org/2020.findings-emnlp.438
DOI:
10.18653/v1/2020.findings-emnlp.438
Cite (ACL):
Xi Victoria Lin, Richard Socher, and Caiming Xiong. 2020. Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4870–4888, Online. Association for Computational Linguistics.
Cite (Informal):
Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing (Lin et al., Findings 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2020.findings-emnlp.438.pdf
Code:
salesforce/TabularSemanticParsing
Data:
WikiSQL