NoviCode: Generating Programs from Natural Language Utterances by Novices

Asaf Achi Mordechai, Yoav Goldberg, Reut Tsarfaty


Abstract
Current Text-to-Code models demonstrate impressive capabilities in generating executable code from natural language snippets. However, current studies focus on technical instructions and programmer-oriented language, and it is an open question whether these models can effectively translate natural language descriptions that are given by non-technical users and express complex goals into an executable program with an intricate flow, composed of API access and control structures such as loops, conditions, and sequences. To address the challenge of generating a complete program from a plain non-technical description, we present NoviCode, a novel NL Programming task, which takes as input an API and a natural language description by a novice non-programmer, and provides an executable program as output. To assess the efficacy of models on this task, we provide a novel benchmark accompanied by test suites wherein the generated program code is assessed not according to its form, but according to its functional execution. Our experiments show that, first, NoviCode is indeed a challenging task in the code synthesis domain, and that generating complex code from non-technical instructions goes beyond the current Text-to-Code paradigm. Second, we show that a novel approach wherein we align the NL utterances with the compositional hierarchical structure of the code greatly enhances the performance of LLMs on this task, compared with the end-to-end Text-to-Code counterparts.
Anthology ID:
2024.tacl-1.73
Volume:
Transactions of the Association for Computational Linguistics, Volume 12
Year:
2024
Address:
Cambridge, MA
Venue:
TACL
Publisher:
MIT Press
Pages:
1330–1345
URL:
https://preview.aclanthology.org/fix-sig-urls/2024.tacl-1.73/
DOI:
10.1162/tacl_a_00694
Cite (ACL):
Asaf Achi Mordechai, Yoav Goldberg, and Reut Tsarfaty. 2024. NoviCode: Generating Programs from Natural Language Utterances by Novices. Transactions of the Association for Computational Linguistics, 12:1330–1345.
Cite (Informal):
NoviCode: Generating Programs from Natural Language Utterances by Novices (Mordechai et al., TACL 2024)
PDF:
https://preview.aclanthology.org/fix-sig-urls/2024.tacl-1.73.pdf