Sangam Sai Anish


2026

Automatically determining whether source code is human-written or produced by a specific family of large language models is becoming essential for reliable assessment, provenance tracking, and dataset curation. We present a lightweight yet competitive system for SemEval-2026 Task 13 Subtask B, which requires attributing each snippet to one of eleven classes: human or one of ten LLM families. Our method repurposes code-oriented, instruction-tuned backbones from the Qwen2.5-Coder series as sequence classifiers and adapts them using QLoRA, which combines frozen low-precision weights with trainable low-rank adapters to reduce memory and compute overhead. Our core design choice is to handle long snippets without discarding evidence. Instead of truncating to a fixed context, we apply an overlapping sliding-window strategy that expands each long example into multiple fixed-length windows during training, all sharing the same label. For validation and test, windows are generated on the fly and their logits are averaged to yield a single prediction per snippet, so that every input token contributes to the decision while inference remains stable. Our final submission ranked 8th on the official Subtask B test set leaderboard.
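The sliding-window expansion and logit averaging described above can be illustrated with a minimal sketch. It is not the submitted implementation: the function names, the window and stride values, and the rule of adding one final window flush with the end of the sequence are all assumptions made for illustration, and the tokenizer and classifier are left out, so per-window logits are passed in as plain lists.

```python
from typing import List, Sequence


def sliding_windows(token_ids: Sequence[int], window: int, stride: int) -> List[List[int]]:
    """Split token_ids into overlapping fixed-length windows covering every token.

    `window` and `stride` are hypothetical hyperparameters; the abstract does
    not state the values used.
    """
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    if len(token_ids) <= window:
        # Short snippets fit in a single (possibly shorter) window.
        return [list(token_ids)]
    starts = list(range(0, len(token_ids) - window + 1, stride))
    # Assumed rule: add a last window flush with the end so trailing tokens are kept.
    if starts[-1] + window < len(token_ids):
        starts.append(len(token_ids) - window)
    return [list(token_ids[s:s + window]) for s in starts]


def average_logits(window_logits: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class logits over all windows of one snippet."""
    n = len(window_logits)
    return [sum(per_class) / n for per_class in zip(*window_logits)]


def predict(window_logits: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged logit."""
    mean = average_logits(window_logits)
    return max(range(len(mean)), key=mean.__getitem__)
```

During training, each window returned by `sliding_windows` would be treated as a separate example carrying the snippet's label. At validation and test time, the classifier would be run on every window and `predict` applied to the stacked logits, giving one label per snippet.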