Azher Ali
2026
NUST CodeIntel at SemEval-2026 Task 13: Cross-Domain Detection of Machine-Generated Code via Stylometric Features and Transformer Models
Azher Ali | Mehwish Fatima
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
We present our submission to SemEval-2026 Task 13 on cross-language and cross-domain detection of machine-generated code. We compare TF-IDF-based models with stylometric features against LoRA-tuned transformer encoders. While transformers achieve near-perfect in-distribution performance, they degrade sharply on unseen languages and domains. In contrast, a TF-IDF + Logistic Regression model attains the best test Macro-F1 and shows greater robustness. These results highlight the limitations of neural models under distribution shift and the strength of lexical baselines for cross-domain generalization.
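The abstract ranks systems by test Macro-F1, so a small worked example of that metric may help: it is the unweighted mean of the per-class F1 scores, which means the minority class counts as much as the majority class. The sketch below is a plain-Python illustration, not the task's official scorer; the function name `macro_f1` and the "human"/"machine" labels are assumptions made for the example.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("cannot score empty inputs")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels, for illustration only.
y_true = ["machine", "machine", "machine", "machine", "human", "human"]
y_pred = ["machine", "machine", "machine", "human", "human", "machine"]
score = macro_f1(y_true, y_pred)
```

In this toy case the "machine" class has F1 = 0.75 and the "human" class has F1 = 0.5, so the macro average is 0.625, below the 4/6 accuracy, because the weaker class is not outvoted by the larger one.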