CodeDet-NITS at SemEval-2026 Task 13: AI Code Authorship Detection Beyond Truncation

L D M S Sai Teja; Annepaka Yadagiri; Kshitij Patiyal; Sangam Sai Anish; Partha Pakray

Abstract

Automatically determining whether source code is human-written or produced by a specific family of large language models (LLMs) is becoming essential for reliable assessment, provenance tracking, and dataset curation. We present a lightweight yet competitive system for SemEval-2026 Task 13 Subtask B, which requires attributing each snippet to one of eleven classes: human or one of ten LLM families. Our method repurposes code-oriented, instruction-tuned backbones from the Qwen2.5-Coder series as sequence classifiers and adapts them with QLoRA, which pairs frozen low-precision weights with trainable low-rank adapters to reduce memory and compute overhead. The core design choice is how to handle long snippets without discarding evidence. Instead of truncating to a fixed context length, we apply an overlapping sliding-window strategy: during training, each long example is expanded into multiple fixed-length windows that all share the example's label. For validation and test, windows are generated on the fly and their logits are averaged to yield a single prediction per snippet, so every token of the input is used while inference remains stable. Our final submission ranked 8th on the official Subtask B test set leaderboard.
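The windowing and aggregation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the `window` and `stride` values, and the choice to leave a shorter final window are all assumptions, since the abstract does not specify them. Tokenization and the classifier itself are left out; the logits here stand in for per-window model outputs.

```python
from typing import List, Sequence, Tuple


def sliding_windows(token_ids: Sequence[int], window: int, stride: int) -> List[List[int]]:
    """Split token_ids into overlapping windows of at most `window` tokens.

    `window` and `stride` are hypothetical parameters; stride < window gives overlap.
    The last window may be shorter than `window` and would need padding in practice.
    """
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    if len(token_ids) <= window:
        return [list(token_ids)]
    windows = []
    start = 0
    while True:
        windows.append(list(token_ids[start:start + window]))
        if start + window >= len(token_ids):
            break  # this window reached the end, so every token is covered
        start += stride
    return windows


def expand_for_training(
    token_ids: Sequence[int], label: int, window: int, stride: int
) -> List[Tuple[List[int], int]]:
    """Turn one long example into several training examples sharing the same label."""
    return [(w, label) for w in sliding_windows(token_ids, window, stride)]


def average_logits(window_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average per-window class logits into one logit vector per snippet."""
    n = len(window_logits)
    if n == 0:
        raise ValueError("at least one window is required")
    num_classes = len(window_logits[0])
    return [sum(row[c] for row in window_logits) / n for c in range(num_classes)]


def predict_label(window_logits: Sequence[Sequence[float]]) -> int:
    """Pick the class with the highest averaged logit."""
    mean = average_logits(window_logits)
    return max(range(len(mean)), key=mean.__getitem__)
```

In this sketch, a snippet that fits in one window passes through unchanged, so short inputs behave exactly as they would under a plain fixed-context classifier; only long inputs are expanded.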

Anthology ID: 2026.semeval-1.397
Volume: Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Ekaterina Kochmar, Debanjan Ghosh, Kai North, Mamoru Komachi
Venues: SemEval | WS
Publisher: Association for Computational Linguistics
Pages: 3165–3168
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.397/
Cite (ACL): Lekkala Sai Teja, Annepaka Yadagiri, Kshitij Patiyal, Sangam Sai Anish, and Partha Pakray. 2026. CodeDet-NITS at SemEval-2026 Task 13: AI Code Authorship Detection Beyond Truncation. In Proceedings of the 20th International Workshop on Semantic Evaluation (2026), pages 3165–3168, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): CodeDet-NITS at SemEval-2026 Task 13: AI Code Authorship Detection Beyond Truncation (Sai Teja et al., SemEval 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.semeval-1.397.pdf
Supplementary material: 2026.semeval-1.397.SupplementaryMaterial.zip
