@inproceedings{villmow-etal-2021-contest,
    title = "{C}on{T}est: A Unit Test Completion Benchmark featuring Context",
    author = "Villmow, Johannes  and
      Depoix, Jonas  and
      Ulges, Adrian",
    editor = "Lachmy, Royi  and
      Yao, Ziyu  and
      Durrett, Greg  and
      Gligoric, Milos  and
      Li, Junyi Jessy  and
      Mooney, Ray  and
      Neubig, Graham  and
      Su, Yu  and
      Sun, Huan  and
      Tsarfaty, Reut",
    booktitle = "Proceedings of the 1st Workshop on Natural Language Processing for Programming (NLP4Prog 2021)",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2021.nlp4prog-1.2/",
    doi = "10.18653/v1/2021.nlp4prog-1.2",
    pages = "17--25",
    abstract = "We introduce CONTEST, a benchmark for NLP-based unit test completion, the task of predicting a test{'}s assert statements given its setup and focal method, i.e. the method to be tested. ConTest is large-scale (with 365k datapoints). Besides the test code and tested code, it also features context code called by either. We found context to be crucial for accurately predicting assertions. We also introduce baselines based on transformer encoder-decoders, and study the effects of including syntactic information and context. Overall, our models achieve a BLEU score of 38.2, while only generating unparsable code in 1.92{\%} of cases."
}