Erin Walker


2025

Can LLMs simulate the same correct solutions to free-response math problems as real students?
Yuya Asano | Diane Litman | Erin Walker
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Large language models (LLMs) have emerged as powerful tools for developing educational systems. While previous studies have explored modeling student mistakes, a critical gap remains in understanding whether LLMs can generate the correct solutions that represent student responses to free-response problems. In this paper, we compare the distribution of correct solutions produced by four LLMs (one proprietary, two open-source general models, and one open-source math model) under various sampling and prompting techniques with the distribution of solutions generated by students, using conversations in which students teach math problems to a conversational robot. Our study reveals discrepancies between the correct solutions produced by LLMs and those produced by students. We discuss the practical implications of these findings for the design and evaluation of LLM-supported educational systems.

2022

Comparison of Lexical Alignment with a Teachable Robot in Human-Robot and Human-Human-Robot Interactions
Yuya Asano | Diane Litman | Mingzhi Yu | Nikki Lobczowski | Timothy Nokes-Malach | Adriana Kovashka | Erin Walker
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue

Speakers build rapport by aligning their conversational behaviors with each other. Rapport engendered with a teachable agent while instructing it in domain material has been shown to promote learning. Past work on lexical alignment in education suffers from limitations in both the measures used to quantify alignment and the types of interactions in which alignment with agents has been studied. In this paper, we apply alignment measures based on a data-driven notion of shared expressions (possibly composed of multiple words) and compare alignment in one-on-one human-robot (H-R) interactions with the H-R portions of collaborative human-human-robot (H-H-R) interactions. We find that students align with a teachable robot more in the H-R setting than in the H-H-R setting, and that the relationship between lexical alignment and rapport is more complex than predicted by previous theoretical and empirical work.