Comparing Behavioral Patterns of LLM and Human Tutors: A Population-level Analysis with the CIMA Dataset

Aayush Kucheria; Nitin Sawhney; Arto Hellas

Abstract

Large Language Models (LLMs) offer exciting potential as educational tutors, and much research explores this potential. Unfortunately, little research examines how the baseline behavioral patterns of LLM tutors differ from those of human tutors. We conduct a preliminary study of these differences with the CIMA dataset and three state-of-the-art LLMs (GPT-4o, Gemini Pro 1.5, and LLaMA 3.1 405B). Our results reveal systematic deviations in these baseline patterns, particularly in the tutoring actions selected and the complexity of responses, and even across different LLMs. This research presents early results on how LLMs deployed as tutors exhibit systematic differences, which has implications for educational technology design and deployment. We note that while LLMs enable more powerful and fluid interaction than previous systems, they simultaneously develop characteristic patterns distinct from human teaching. Understanding these differences can inform better integration of AI in educational settings.

Anthology ID: 2025.bea-1.64
Volume: Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Ekaterina Kochmar, Bashar Alhafni, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan
Venues: BEA | WS
Publisher: Association for Computational Linguistics
Pages: 873–881
URL: https://preview.aclanthology.org/landing_page/2025.bea-1.64/
Cite (ACL): Aayush Kucheria, Nitin Sawhney, and Arto Hellas. 2025. Comparing Behavioral Patterns of LLM and Human Tutors: A Population-level Analysis with the CIMA Dataset. In Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025), pages 873–881, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Comparing Behavioral Patterns of LLM and Human Tutors: A Population-level Analysis with the CIMA Dataset (Kucheria et al., BEA 2025)
PDF: https://preview.aclanthology.org/landing_page/2025.bea-1.64.pdf
