MINDS at SemEval-2026-Task 13: Robust Detection of Machine-Generated Code under Distribution Shift

Giorgia Rosalia Buccelli, Antonella Coviello, Alexandra Elena Holota, Marco Scaglione, Simone Scalora, Claudio Savelli, Riccardo Coppola, Flavio Giobergia


Abstract
The growing use of large language models for code generation makes distinguishing machine-generated code from human-written code increasingly difficult, especially under distribution shifts in language, domain, and generator family. SemEval-2026 Task 13 targets this challenge through three subtasks: binary detection, multi-class authorship attribution, and hybrid/adversarial code detection. In this paper, we conduct an empirical study across all subtasks, comparing a variety of approaches: frozen encoder representations, feature-based classifiers, fine-tuned transformer models, post-hoc calibration, and probability-level ensembling. Our results show a consistent generalisation gap: strong in-domain validation scores substantially overestimate performance on shifted test conditions. The code is available at https://github.com/AlexandraElena-Holota/SemEval-2026-Task13.git
Anthology ID:
2026.semeval-1.386
Volume:
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
3080–3088
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.386/
Cite (ACL):
Giorgia Rosalia Buccelli, Antonella Coviello, Alexandra Elena Holota, Marco Scaglione, Simone Scalora, Claudio Savelli, Riccardo Coppola, and Flavio Giobergia. 2026. MINDS at SemEval-2026-Task 13: Robust Detection of Machine-Generated Code under Distribution Shift. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3080–3088, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
MINDS at SemEval-2026-Task 13: Robust Detection of Machine-Generated Code under Distribution Shift (Buccelli et al., SemEval 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.386.pdf