Data-lean fine-tuning of models for evaluating teacher performance in a GenAI-led elicitation simulation
Beata Beigman Klebanov, Andrew Hoang, Jamie Mikeska, Benny Longwill, Sanjna Kashyap, Shreyashi Halder, Aakanksha Bhatia
Abstract
Recent advances in the capabilities of conversational agents based on large language models make them a very promising tool for role-playing K-12 students in order to train educators in conversational teaching practices, such as eliciting student thinking, explaining disciplinary content, and facilitating a classroom discussion. In fact, such simulations can be, and have been, developed relatively quickly and without data to machine-learn from – neither classroom data nor human-simulated data. To enhance the usefulness and effectiveness of such teaching simulations, it is necessary to provide pedagogically sound, timely, and personalized feedback to the educator about their simulation performance. In this study, we present experiments on fine-tuning models to evaluate educator performance in an elicitation teaching simulation. The models are developed with data collected during usability testing of the simulation and evaluated on real user data. We show that even with relatively little fine-tuning data, robust performance can be obtained.
- Anthology ID: 2026.bea-1.38
- Volume: Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
- Month: July
- Year: 2026
- Address: San Diego, California, USA
- Editors: Ekaterina Kochmar, Bashar Alhafni, Stefano Bannò, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anais Tack, Victoria Yaneva, Zheng Yuan
- Venues: BEA | WS
- Publisher: Association for Computational Linguistics
- Pages: 546–561
- URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.38/
- Cite (ACL): Beata Beigman Klebanov, Andrew Hoang, Jamie Mikeska, Benny Longwill, Sanjna Kashyap, Shreyashi Halder, and Aakanksha Bhatia. 2026. Data-lean fine-tuning of models for evaluating teacher performance in a GenAI-led elicitation simulation. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), pages 546–561, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal): Data-lean fine-tuning of models for evaluating teacher performance in a GenAI-led elicitation simulation (Beigman Klebanov et al., BEA 2026)
- PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.38.pdf