Abstract
Can non-programmers annotate natural language utterances with complex programs that represent their meaning? We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex). Since they cannot understand the candidate programs, we ask them to select indirectly by examining the programs' input-output examples. For each utterance, APEL actively searches for a simple input on which the candidate programs tend to produce different outputs. It then asks the non-programmers only to choose the appropriate output, thus allowing us to infer which program is correct and could be used to fine-tune the parser. As a first case study, we recruited human non-programmers to use APEL to re-annotate SPIDER, a text-to-SQL dataset. Our approach achieved the same annotation accuracy as the original expert annotators (75%) and exposed many subtle errors in the original annotations.
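To make the loop the abstract describes concrete, below is a minimal Python sketch of one annotation round, assuming each candidate program is an executable callable. The names `pick_informative_input` and `ask_annotator`, the Gini-style disagreement score, and the soft-update constant are all hypothetical simplifications standing in for the paper's active search over simple inputs, not its actual implementation.

```python
from collections import defaultdict

def pick_informative_input(candidates, priors, inputs):
    """Choose the test input on which the weighted candidates'
    outputs disagree most (a Gini-style stand-in for an
    information-gain criterion)."""
    def disagreement(db):
        mass = defaultdict(float)
        for prog, p in zip(candidates, priors):
            mass[repr(prog(db))] += p
        # more distinct, evenly weighted outputs -> more informative question
        return 1.0 - sum(w * w for w in mass.values())
    return max(inputs, key=disagreement)

def apel_round(candidates, priors, inputs, ask_annotator):
    """One round: show the annotator the most informative input,
    collect the output they judge correct, and reweight candidates."""
    db = pick_informative_input(candidates, priors, inputs)
    outputs = [prog(db) for prog in candidates]
    chosen = ask_annotator(db, sorted({repr(o) for o in outputs}))
    # soft update: keep a little mass on disagreeing programs,
    # since annotators occasionally err
    posterior = [p if repr(o) == chosen else p * 1e-3
                 for p, o in zip(priors, outputs)]
    total = sum(posterior)
    return [p / total for p in posterior]
```

In the text-to-SQL setting, each `prog` would wrap execution of a candidate SQL query against a small synthesized database (e.g., via `sqlite3`), and `ask_annotator` would render the database and the competing outputs for a non-programmer to choose from.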
- Anthology ID: 2023.emnlp-main.312
- Volume: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Houda Bouamor, Juan Pino, Kalika Bali
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 5126–5152
- URL: https://aclanthology.org/2023.emnlp-main.312
- DOI: 10.18653/v1/2023.emnlp-main.312
- Cite (ACL): Ruiqi Zhong, Charlie Snell, Dan Klein, and Jason Eisner. 2023. Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5126–5152, Singapore. Association for Computational Linguistics.
- Cite (Informal): Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL (Zhong et al., EMNLP 2023)
- PDF: https://preview.aclanthology.org/landing_page/2023.emnlp-main.312.pdf