Bitzkrieg at SemEval-2026 Task 13: Calibration-Aware Dual CodeBERT for Multilingual Machine-Generated Code Detection
Thenmozhi D., Adithya S, Harshil Malisetty, Aadit P, Rohan R
Abstract
We describe our submission to SemEval-2026 Task 13, addressing binary detection (Subtask A), generator attribution (Subtask B), and hybrid/adversarial authorship classification (Subtask C) of machine-generated code (MGC). For Subtask A, we fine-tune two CodeBERT models with complementary sampling strategies and apply percentile-based post-hoc calibration, improving Macro-F1 from 0.47 to 0.56 without additional training. For Subtask B, we combine TF-IDF n-grams, frozen CodeBERT embeddings, and language features with XGBoost, using synthetic augmentation and class weighting to handle an 11-class dataset skewed 88% toward the human class, achieving Macro-F1 of 0.289. For Subtask C, we fine-tune a CodeBERT classifier for four-way authorship classification, achieving Macro-F1 of 0.49. Our results highlight the importance of probability calibration for binary detection and class balancing for multi-class attribution.
- Anthology ID:
- 2026.semeval-1.282
- Volume:
- Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
- Venues:
- SemEval | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2233–2237
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.282/
- Cite (ACL):
- Thenmozhi D., Adithya S, Harshil Malisetty, Aadit P, and Rohan R. 2026. Bitzkrieg at SemEval-2026 Task 13: Calibration-Aware Dual CodeBERT for Multilingual Machine-Generated Code Detection. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 2233–2237, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- Bitzkrieg at SemEval-2026 Task 13: Calibration-Aware Dual CodeBERT for Multilingual Machine-Generated Code Detection (D. et al., SemEval 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.282.pdf
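
The abstract's Subtask A recipe (two fine-tuned models followed by percentile-based post-hoc calibration) can be illustrated with a small sketch. The abstract does not specify how the two models are combined or which percentile is used, so everything below is an assumption made for illustration: the simple averaging rule, the `expected_positive_rate` parameter, the function names, and the toy probabilities are all hypothetical and are not taken from the paper. The idea shown is only that, when a model's scores are skewed, a cutoff chosen from the score distribution can behave very differently from a fixed 0.5 threshold, and that this needs no additional training.

```python
import numpy as np

def percentile_threshold(probs, expected_positive_rate):
    """Return a cutoff that flags roughly `expected_positive_rate` of the inputs.

    Hypothetical helper: the (1 - rate) quantile of the scores leaves
    about `rate` of them above the cutoff.
    """
    return float(np.quantile(probs, 1.0 - expected_positive_rate))

def calibrated_predict(probs_a, probs_b, expected_positive_rate):
    """Combine two models' scores and apply a percentile-based cutoff.

    Assumption: the two models are combined by a plain average of their
    machine-generated probabilities; the paper may combine them differently.
    """
    avg = (np.asarray(probs_a) + np.asarray(probs_b)) / 2.0
    thr = percentile_threshold(avg, expected_positive_rate)
    return (avg > thr).astype(int), thr

# Toy scores from two overconfident models (made-up numbers).
probs_a = np.array([0.91, 0.94, 0.97, 0.99, 0.93, 0.98, 0.96, 0.92])
probs_b = np.array([0.89, 0.96, 0.95, 0.99, 0.91, 0.98, 0.98, 0.90])

# A fixed 0.5 cutoff flags every sample as machine-generated.
naive = (((probs_a + probs_b) / 2.0) > 0.5).astype(int)

# A percentile cutoff, assuming about half the samples are machine-generated.
labels, thr = calibrated_predict(probs_a, probs_b, expected_positive_rate=0.5)
```

In this toy case the fixed threshold labels all eight samples as machine-generated, while the percentile cutoff labels only the four highest-scoring ones. In practice the assumed positive rate would have to be estimated, for example from a validation split.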