Natural Language Interface for Databases Using a Dual-Encoder Model

Ionel Alexandru Hosu, Radu Cristian Alexandru Iacob, Florin Brad, Stefan Ruseti, Traian Rebedea


Abstract
We propose a sketch-based, two-step neural model for generating structured queries (SQL) from a user’s request in natural language. The sketch is obtained by replacing specific entities in the SQL query, such as column names, table names, aliases and variables, with placeholders, in a process similar to semantic parsing. In the first step, a sequence-to-sequence (SEQ2SEQ) model determines the most probable SQL sketch for the natural language request. In the second step, a dual-encoder SEQ2SEQ network takes both the text query and the previously obtained sketch as input and generates the final SQL query. Our approach improves on previous approaches on two recent large datasets (WikiSQL and SENLIDB) suited to data-driven solutions for natural language interfaces for databases.
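As a rough illustration of what a sketch with placeholders might look like, the following minimal Python example replaces table names, column names and literal values in a SQL string with generic tokens. It is not the authors' procedure. The function name `make_sketch`, the placeholder tokens `TABLE`, `COLUMN` and `VALUE`, and the regex-based tokenization are all assumptions made for this example. Aliases and variables, which the abstract also mentions, are not handled here.

```python
import re


def make_sketch(sql, tables, columns):
    """Replace schema-specific tokens in `sql` with generic placeholders.

    Hypothetical helper: the placeholder names and the matching rules
    are illustrative assumptions, not taken from the paper.
    """
    # Quoted string literals and numbers become VALUE placeholders first.
    sketch = re.sub(r"'[^']*'", "VALUE", sql)
    sketch = re.sub(r"\b\d+(\.\d+)?\b", "VALUE", sketch)

    table_names = {t.lower() for t in tables}
    column_names = {c.lower() for c in columns}

    out = []
    # Split into word tokens and single punctuation characters.
    for token in re.findall(r"\w+|[^\w\s]", sketch):
        low = token.lower()
        if low in table_names:
            out.append("TABLE")
        elif low in column_names:
            out.append("COLUMN")
        else:
            out.append(token)
    return " ".join(out)


print(make_sketch(
    "SELECT name FROM users WHERE age > 30",
    tables=["users"],
    columns=["name", "age"],
))
# prints: SELECT COLUMN FROM TABLE WHERE COLUMN > VALUE
```

In the two-step setup the abstract describes, a first model would predict a sketch of this kind from the request, and a second model would generate the final query from the request and the sketch together.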
Anthology ID:
C18-1043
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
514–524
URL:
https://aclanthology.org/C18-1043
Cite (ACL):
Ionel Alexandru Hosu, Radu Cristian Alexandru Iacob, Florin Brad, Stefan Ruseti, and Traian Rebedea. 2018. Natural Language Interface for Databases Using a Dual-Encoder Model. In Proceedings of the 27th International Conference on Computational Linguistics, pages 514–524, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Natural Language Interface for Databases Using a Dual-Encoder Model (Hosu et al., COLING 2018)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/C18-1043.pdf
Data
WikiSQL