Janani Hariharakrishnan
2026
Pixel Phantoms at SemEval-2026 Task 13: Exploring Classical and Neural Approaches for AI-Generated Code Detection
Jithu Morrison S | Janani Hariharakrishnan | Angel Deborah S | Rajalakshmi S
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
This paper describes our system for SemEval-2026 Task 13, Subtask A: detecting whether a given code snippet is AI-generated or human-written. We explored a range of approaches, from classical machine learning baselines using TF-IDF representations to fine-tuned transformer models pre-trained on code, specifically CodeBERT and GraphCodeBERT. Our experiments revealed a notable degradation in performance when CodeBERT was trained beyond an optimal number of steps, suggesting that continued training within an epoch can lead to overfitting or representation drift. GraphCodeBERT, by contrast, yielded our best submission, with a macro F1 score of 0.36866. Our findings highlight the sensitivity of code-specific transformers to training duration and suggest that early checkpoint selection is critical for this task.
2025
Pixel Phantoms at SemEval-2025 Task 11: Enhancing Multilingual Emotion Detection with a T5 and mT5-Based Approach
Jithu Morrison S | Janani Hariharakrishnan | Harsh Pratap Singh
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)
Emotion recognition in textual data is a crucial NLP task with applications in sentiment analysis and mental health monitoring. SemEval-2025 Task 11 introduces a multilingual dataset spanning 28 languages, including low-resource ones, to improve cross-lingual emotion detection. Our approach uses T5 for English and mT5 for the other languages, fine-tuning both for multi-label classification and emotion intensity estimation. Our findings demonstrate the effectiveness of transformer-based models in capturing nuanced emotional expressions across diverse languages.