SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios

Daniil Orel; Dilshod Azizov; Indraneil Paul; Yuxia Wang; Iryna Gurevych; Preslav Nakov

Abstract

We present the results and the main findings of SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios. Our task featured three subtasks. Subtask A is a binary classification task that determines whether a given code snippet was written by a human or generated by a machine. This subtask focuses on the development of robust methods for AI-generated code identification, since the training and the test data splits contain code in different languages and cover diverse usage domains. Subtask B focuses on defining synthetic code smells and requires participants to identify the generator family of the model that produced the given code snippet. Subtask C aims at more fine-grained attribution of the written code: whether it was fully AI-generated, fully human-written, produced in human-AI collaboration (hybrid), or produced by a model tuned or prompted to give human-like code. The task attracted a large number of participants: subtask A (81), subtask B (34), and subtask C (32). In this paper, we present the task, analyze the results, and discuss the submitted systems and the methods they used.

Anthology ID: 2026.semeval-1.445
Volume: Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 3640–3658
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.445/
Cite (ACL): Daniil Orel, Dilshod Azizov, Indraneil Paul, Yuxia Wang, Iryna Gurevych, and Preslav Nakov. 2026. SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3640–3658, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios (Orel et al., SemEval 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.445.pdf