Farhan Nafis Rayhan at SemEval-2026 Task 13: Supervised Contrastive Learning Approach with Gated Multiclass Decomposition Ensemble Architecture for Code Authorship Identification

Farhan Rayhan, Fariska Ruskanda


Abstract
This paper presents our submission for SemEval-2026 Task 13 Subtask B, which requires multi-class attribution of code snippets across 10 distinct AI generator families and a human baseline. Our proposed system uses a three-stage ensemble architecture designed to handle extreme class imbalance and capture subtle stylometric fingerprints. First, we fine-tune UniXcoder and ModernBERT backbones with Supervised Contrastive Learning. The resulting embeddings are then processed by five heterogeneous shallow experts, each using a multiclass decomposition to specialize in specific generator lineages through its own architecture. A Human Shield, an aggressive binary human-versus-machine layer, acts as a hierarchical safety auditor. Finally, a Context-Aware Gated Meta-Learner dynamically aggregates the expert opinions into final predictions. Our experiments reveal that streamlining the system to a pure UniXcoder backbone fine-tuned with supervised contrastive learning improves performance, outperforming the official CodeBERT baseline with a final Macro-F1 score of 0.31389 and ranking 26th overall.
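To make the supervised contrastive objective named in the abstract concrete, here is a minimal sketch, assuming the standard formulation in which embeddings that share a label are treated as positives and all other embeddings in the batch serve as contrast samples. The function names, the temperature value, and the toy inputs are illustrative assumptions; this is not the authors' implementation, which is not described on this page.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: list of equal-length float lists, one per code snippet.
    labels: list of class ids (for example, generator family or human).
    temperature: assumed value; the paper's setting is not given here.
    """
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives are the other samples that share the anchor's label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of the positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supcon_loss(toy_embeddings, [0, 0, 1, 1])   # labels match clusters
shuffled = supcon_loss(toy_embeddings, [0, 1, 0, 1])  # labels cut across clusters
```

In this toy batch the loss is small when the labels agree with the geometric clusters and large when they do not, which is the pressure that pulls same-class embeddings together during fine-tuning.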
Anthology ID:
2026.semeval-1.431
Volume:
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month:
July
Year:
2026
Address:
San Diego, California, USA
Editors:
Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues:
SemEval | WS
Publisher:
Association for Computational Linguistics
Pages:
3485–3494
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.431/
Cite (ACL):
Farhan Rayhan and Fariska Ruskanda. 2026. Farhan Nafis Rayhan at SemEval-2026 Task 13: Supervised Contrastive Learning Approach with Gated Multiclass Decomposition Ensemble Architecture for Code Authorship Identification. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3485–3494, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal):
Farhan Nafis Rayhan at SemEval-2026 Task 13: Supervised Contrastive Learning Approach with Gated Multiclass Decomposition Ensemble Architecture for Code Authorship Identification (Rayhan & Ruskanda, SemEval 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.431.pdf