Dipit Saha

2026

ASTraNet at SemEval-2026 Task 13: Not All Code Looks the Same: Multi-View Structural and Semantic Detection of Machine-Generated Code
Ruwad Naswan | Dipit Saha | Md. Kabir | Nabiha Tahseen
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

The growing adoption of large language models for code generation poses challenges for code quality, security, and authorship verification, particularly when test conditions involve unseen programming languages, generators, or application domains. We present our ASTraNet system, which combines three code-pretrained transformer encoders (CodeT5p-220M, CodeBERT, UniXcoder) with a structure-first Flow-Augmented AST (FA-AST) encoder implemented as a Gated Graph Neural Network. On Subtask A, our best single model achieves a macro F1 of 0.559; a post-competition layered rank-fusion ensemble across all three encoders raises this to 0.643. On Subtask C, our official submission obtains 0.585; a three-stage ensemble combining neural probabilities with LightGBM-based features and class-priority routing raises this to 0.652. Our contributions include a language-agnostic structural detector, a diversity-driven rank-fusion strategy that exploits low inter-model correlation for binary classification, and a meta-learner stacking pipeline for multi-class detection under distribution shift.
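The abstract names rank fusion across encoders but does not spell out the "layered" procedure, so the following is only a minimal sketch of plain rank averaging, not the authors' method. The function names `to_ranks` and `rank_fuse` are hypothetical, and the inputs are assumed to be per-model machine-generated scores for the same samples in the same order. Converting scores to ranks before averaging removes each model's calibration scale, which is one reason weakly correlated models can complement each other.

```python
from typing import List, Sequence


def to_ranks(scores: Sequence[float]) -> List[float]:
    """Map scores to ranks in [0, 1]; tied scores share their average rank."""
    n = len(scores)
    if n == 1:
        return [0.5]
    # Sample indices ordered from lowest to highest score.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of samples tied with position i.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_position = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_position / (n - 1)
        i = j + 1
    return ranks


def rank_fuse(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-model normalised ranks for each sample (assumed equal weights)."""
    per_model = [to_ranks(s) for s in model_scores]
    n_models = len(per_model)
    return [sum(col) / n_models for col in zip(*per_model)]


if __name__ == "__main__":
    # Illustrative scores only; these are not results from the paper.
    model_a = [0.1, 0.9, 0.4, 0.8]
    model_b = [0.2, 0.6, 0.5, 0.7]
    print(rank_fuse([model_a, model_b]))
```

A binary decision would then need a cut-off on the fused rank; how that cut-off is chosen, and how the fusion is layered, are left open here because the abstract does not say.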

