Sparsh Rastogi
2025
Thapar Titan/s : Fine-Tuning Pretrained Language Models with Contextual Augmentation for Mistake Identification in Tutor–Student Dialogues
Harsh Dadwal
|
Sparsh Rastogi
|
Jatin Bedi
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
This paper presents Thapar Titan/s’ submission to the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors. The shared task consists of five subtasks; our team ranked 18th in Mistake Identification, 15th in Mistake Location, and 18th in Actionability. However, in this paper, we focus exclusively on presenting results for Task 1: Mistake Identification, which evaluates a system’s ability to detect student mistakes.Our approach employs contextual data augmentation using a RoBERTa based masked language model to mitigate class imbalance, supplemented by oversampling and weighted loss training. Subsequently, we fine-tune three separate classifiers: RoBERTa, BERT, and DeBERTa for three-way classification aligned with task-specific annotation schemas. This modular and scalable pipeline enables a comprehensive evaluation of tutor feedback quality in educational dialogues.