SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios
Daniil Orel, Dilshod Azizov, Indraneil Paul, Yuxia Wang, Iryna Gurevych, Preslav Nakov
Abstract
We present the results and the main findings of SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios. The task featured three subtasks. Subtask A is a binary classification task: determine whether a given code snippet was written by a human or generated by a machine. This subtask focuses on developing robust methods for identifying AI-generated code, since the training and test splits contain code in different languages and cover diverse usage domains. Subtask B focuses on defining synthetic code smells and requires participants to identify the generator family of the model that produced the given code snippet. Subtask C aims at a more fine-grained attribution of the code: whether it was fully AI-generated, fully human-written, produced in human-AI collaboration (hybrid), or produced by a model tuned or prompted to give human-like code. The task attracted broad participation: subtask A (81), subtask B (34), and subtask C (32). In this paper, we present the task, analyze the results, and discuss the submitted systems and the methods they used.
- Anthology ID:
- 2026.semeval-1.445
- Volume:
- Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3640–3658
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.445/
- Cite (ACL):
- Daniil Orel, Dilshod Azizov, Indraneil Paul, Yuxia Wang, Iryna Gurevych, and Preslav Nakov. 2026. SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3640–3658, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- SemEval-2026 Task 13: Detecting Machine-Generated Code with Multiple Programming Languages, Generators, and Application Scenarios (Orel et al., SemEval 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.445.pdf