NLIP at BEA 2025 Shared Task: Evaluation of Pedagogical Ability of AI Tutors

Trishita Saha, Shrenik Ganguli, Maunendra Sankar Desarkar


Abstract
This paper describes the system created for the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors. The task assesses how well AI tutors identify and locate student errors, provide guidance, and ensure actionability, among other features of their responses in educational dialogues. Transformer-based models, especially DeBERTa and RoBERTa, are improved through multitask learning, threshold tuning, ordinal regression, and oversampling. The strong performance of the best systems across all evaluation tracks demonstrates the effectiveness of pedagogically driven training methods and tailored transformer models for evaluating AI tutor quality.
Anthology ID:
2025.bea-1.99
Volume:
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Ekaterina Kochmar, Bashar Alhafni, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan
Venues:
BEA | WS
Publisher:
Association for Computational Linguistics
Pages:
1242–1253
URL:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bea-1.99/
Cite (ACL):
Trishita Saha, Shrenik Ganguli, and Maunendra Sankar Desarkar. 2025. NLIP at BEA 2025 Shared Task: Evaluation of Pedagogical Ability of AI Tutors. In Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), pages 1242–1253, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
NLIP at BEA 2025 Shared Task: Evaluation of Pedagogical Ability of AI Tutors (Saha et al., BEA 2025)
PDF:
https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bea-1.99.pdf