Sangam Sai Anish

2026

CodeDet-NITS at SemEval-2026 Task 13: AI Code Authorship Detection Beyond Truncation
Lekkala Sai Teja | Annepaka Yadagiri | Kshitij Patiyal | Sangam Sai Anish | Partha Pakray
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

Automatically determining whether source code is human-written or produced by a specific family of large language models is becoming essential for reliable assessment, provenance tracking, and dataset curation. We present a lightweight yet competitive system for SemEval-2026 Task 13 Subtask B, which requires attributing each snippet to one of eleven classes: human or one of ten LLM families. Our method repurposes code-oriented, instruction-tuned backbones from the Qwen2.5-Coder series as sequence classifiers and adapts them using QLoRA, combining frozen low-precision weights with low-rank trainable adapters to reduce memory and compute overhead. The core design choice is to handle long snippets without discarding evidence. Instead of truncating to a fixed context, we apply an overlapping sliding-window strategy that expands long examples into multiple fixed-length windows during training, all sharing the same label. For validation and test, windows are generated on the fly and their evidence is aggregated by averaging logits to yield a single prediction per snippet, enabling token-complete use of the input while keeping inference stable. Our final submission ranked 8th on the official Subtask B test set leaderboard.
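
The windowing and logit-averaging steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the window length, and the stride are hypothetical, and the per-window logits are assumed to come from some classifier that is not shown here.

```python
from typing import List, Sequence


def sliding_windows(token_ids: Sequence[int], window: int, stride: int) -> List[List[int]]:
    """Split a token sequence into overlapping windows that cover every token.

    `window` and `stride` are illustrative parameters; the abstract does not
    state the values used in the actual system.
    """
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    tokens = list(token_ids)
    if len(tokens) <= window:
        return [tokens]  # short snippets fit in a single window
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # the last window reached the end of the snippet
        start += stride
    return windows


def average_logits(per_window_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average class logits across all windows of one snippet."""
    if not per_window_logits:
        raise ValueError("at least one window is required")
    n = len(per_window_logits)
    num_classes = len(per_window_logits[0])
    return [sum(row[c] for row in per_window_logits) / n for c in range(num_classes)]


def predict_label(per_window_logits: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged logit."""
    mean = average_logits(per_window_logits)
    return max(range(len(mean)), key=mean.__getitem__)


# Toy usage: 10 tokens, window of 4, stride of 2 (overlap of 2 tokens).
toy_windows = sliding_windows(list(range(10)), window=4, stride=2)
# Hypothetical logits for two windows over three classes.
toy_label = predict_label([[2.0, 0.0, 1.0], [0.0, 3.0, 1.0]])
```

In this sketch the final window may be shorter than `window`, so a real pipeline would presumably pad it before batching. During training each window would simply inherit the snippet's label, while at inference the averaged logits give one prediction per snippet.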

