MINDS at SemEval-2026-Task 13: Robust Detection of Machine-Generated Code under Distribution Shift
Giorgia Rosalia Buccelli, Antonella Coviello, Alexandra Elena Holota, Marco Scaglione, Simone Scalora, Claudio Savelli, Riccardo Coppola, Flavio Giobergia
Abstract
The growing use of large language models for code generation makes distinguishing machine-generated code from human-written code increasingly difficult, especially under distribution shifts in language, domain, and generator family. SemEval-2026 Task 13 targets this challenge through three subtasks: binary detection, multi-class authorship attribution, and hybrid/adversarial code detection. In this paper, we conduct an empirical study across all subtasks, comparing a variety of approaches: frozen encoder representations, feature-based classifiers, fine-tuned transformer models, post-hoc calibration, and probability-level ensembling. Our results show a consistent generalisation gap: strong in-domain validation scores substantially overestimate performance on shifted test conditions. The code is available at https://github.com/AlexandraElena-Holota/SemEval-2026-Task13.git
- Anthology ID:
- 2026.semeval-1.386
- Volume:
- Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 3080–3088
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.386/
- Cite (ACL):
- Giorgia Rosalia Buccelli, Antonella Coviello, Alexandra Elena Holota, Marco Scaglione, Simone Scalora, Claudio Savelli, Riccardo Coppola, and Flavio Giobergia. 2026. MINDS at SemEval-2026-Task 13: Robust Detection of Machine-Generated Code under Distribution Shift. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3080–3088, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- MINDS at SemEval-2026-Task 13: Robust Detection of Machine-Generated Code under Distribution Shift (Buccelli et al., SemEval 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.386.pdf