@inproceedings{zhong-etal-2023-non,
title = "Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-{SQL}",
author = "Zhong, Ruiqi and
Snell, Charlie and
Klein, Dan and
Eisner, Jason",
editor = "Bouamor, Houda and
Pino, Juan and
Bali, Kalika",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/Author-page-Marten-During-lu/2023.emnlp-main.312/",
doi = "10.18653/v1/2023.emnlp-main.312",
pages = "5126--5152",
abstract = "Can non-programmers annotate natural language utterances with complex programs that represent their meaning? We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex). Since they cannot understand the candidate programs, we ask them to select indirectly by examining the programs' input-ouput examples. For each utterance, APEL actively searches for a simple input on which the candidate programs tend to produce different outputs. It then asks the non-programmers only to choose the appropriate output, thus allowing us to infer which program is correct and could be used to fine-tune the parser. As a first case study, we recruited human non-programmers to use APEL to re-annotate SPIDER, a text-to-SQL dataset. Our approach achieved the same annotation accuracy as the original expert annotators (75{\%}) and exposed many subtle errors in the original annotations."
}