Antonella Coviello
2026
MINDS at SemEval-2026-Task 13: Robust Detection of Machine-Generated Code under Distribution Shift
Giorgia Rosalia Buccelli | Antonella Coviello | Alexandra Elena Holota | Marco Scaglione | Simone Scalora | Claudio Savelli | Riccardo Coppola | Flavio Giobergia
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
The growing use of large language models for code generation makes distinguishing machine-generated code from human-written code increasingly difficult, especially under distribution shifts in language, domain, and generator family. SemEval-2026 Task 13 targets this challenge through three subtasks: binary detection, multi-class authorship attribution, and hybrid/adversarial code detection. In this paper, we conduct an empirical study across all subtasks, comparing a variety of approaches: frozen encoder representations, feature-based classifiers, fine-tuned transformer models, post-hoc calibration, and probability-level ensembling. Our results show a consistent generalisation gap: strong in-domain validation scores substantially overestimate performance on shifted test conditions. The code is available at https://github.com/AlexandraElena-Holota/SemEval-2026-Task13.git
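As a rough illustration of what post-hoc calibration followed by probability-level ensembling can look like, the sketch below applies temperature scaling to each model's logits and then averages the resulting class probabilities. The abstract does not say which calibration method or ensemble weighting the authors use, so temperature scaling, the uniform weights, the function names, and the example logits are all assumptions made for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature > 1 flattens the distribution.

    Temperature scaling is an assumed calibration method, not one confirmed
    by the abstract.
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_probabilities(prob_lists, weights=None):
    """Weighted average of per-model class probabilities (uniform by default)."""
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists))
        for c in range(n_classes)
    ]


# Hypothetical logits from two detectors for one snippet,
# ordered as [human-written, machine-generated].
logits_a = [2.0, 0.5]
logits_b = [0.2, 1.1]

# Hypothetical temperatures; in practice each would be fitted on held-out data.
probs_a = softmax(logits_a, temperature=1.5)
probs_b = softmax(logits_b, temperature=1.0)

ensembled = ensemble_probabilities([probs_a, probs_b])
predicted = max(range(len(ensembled)), key=ensembled.__getitem__)
```

If the temperatures are fitted on in-domain validation data, they may not transfer to shifted test conditions, which is one way the generalisation gap described in the abstract could show up in a pipeline like this.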