DuoRAT: Towards Simpler Text-to-SQL Models

Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, Chris Pal


Abstract
Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases. Working mostly on the Spider dataset, researchers have proposed increasingly sophisticated solutions to the problem. Contrary to this trend, in this paper we focus on simplifications. We begin by building DuoRAT, a re-implementation of the state-of-the-art RAT-SQL model that, unlike RAT-SQL, uses only relation-aware or vanilla transformers as its building blocks. We perform several ablation experiments using DuoRAT as the baseline model. Our experiments confirm the usefulness of some techniques and point out the redundancy of others, including structural SQL features and features that link the question with the schema.
Anthology ID:
2021.naacl-main.103
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
1313–1321
URL:
https://aclanthology.org/2021.naacl-main.103
DOI:
10.18653/v1/2021.naacl-main.103
Cite (ACL):
Torsten Scholak, Raymond Li, Dzmitry Bahdanau, Harm de Vries, and Chris Pal. 2021. DuoRAT: Towards Simpler Text-to-SQL Models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1313–1321, Online. Association for Computational Linguistics.
Cite (Informal):
DuoRAT: Towards Simpler Text-to-SQL Models (Scholak et al., NAACL 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.naacl-main.103.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2021.naacl-main.103.mp4
Code:
ElementAI/duorat