LexiLogic at BEA 2025 Shared Task: Fine-tuning Transformer Language Models for the Pedagogical Skill Evaluation of LLM-based tutors
Souvik Bhattacharyya, Billodal Roy, Niranjan M, Pranav Gupta
Abstract
While large language models show promise as AI tutors, evaluating their pedagogical capabilities remains challenging. In this paper, we, team LexiLogic, present our participation in the BEA 2025 shared task on evaluating AI tutors across five dimensions: Mistake Identification, Mistake Location, Providing Guidance, Actionability, and Tutor Identification. We approach all tracks as classification tasks, fine-tuning transformer models on a dataset of 300 educational dialogues between a student and a tutor in the mathematical domain. Our results show varying performance across tracks, with macro-average F1 scores ranging from 0.47 to 0.82 and rankings between 4th and 31st place. Such models have the potential to be used in developing automated scoring metrics for assessing the pedagogical skills of AI math tutors.
- Anthology ID:
- 2025.bea-1.93
- Volume:
- Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
- Month:
- July
- Year:
- 2025
- Address:
- Vienna, Austria
- Editors:
- Ekaterina Kochmar, Bashar Alhafni, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan
- Venues:
- BEA | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1180–1186
- URL:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bea-1.93/
- Cite (ACL):
- Souvik Bhattacharyya, Billodal Roy, Niranjan M, and Pranav Gupta. 2025. LexiLogic at BEA 2025 Shared Task: Fine-tuning Transformer Language Models for the Pedagogical Skill Evaluation of LLM-based tutors. In Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), pages 1180–1186, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- LexiLogic at BEA 2025 Shared Task: Fine-tuning Transformer Language Models for the Pedagogical Skill Evaluation of LLM-based tutors (Bhattacharyya et al., BEA 2025)
- PDF:
- https://preview.aclanthology.org/acl25-workshop-ingestion/2025.bea-1.93.pdf
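The macro-average F1 scores reported in the abstract weight every class equally, regardless of class frequency — a common choice for imbalanced shared-task labels. As a minimal sketch (not the authors' evaluation script), the metric can be computed from scratch as follows:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then average with equal weight."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)
```

Because each class contributes equally, a classifier that ignores a rare class is penalized heavily under this metric, which is why shared tasks with skewed label distributions typically report macro rather than micro averages.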