@inproceedings{asano-etal-2025-llms,
    title = "Can {LLM}s simulate the same correct solutions to free-response math problems as real students?",
    author = "Asano, Yuya  and
      Litman, Diane  and
      Walker, Erin",
    editor = "Christodoulopoulos, Christos  and
      Chakraborty, Tanmoy  and
      Rose, Carolyn  and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.827/",
    pages = "16347--16376",
    ISBN = "979-8-89176-332-6",
    abstract = "Large language models (LLMs) have emerged as powerful tools for developing educational systems. While previous studies have explored modeling student mistakes, a critical gap remains in understanding whether LLMs can generate correct solutions that represent student responses to free-response problems. In this paper, we compare the distribution of solutions produced by four LLMs (one proprietary model, two open-source general models, and one open-source math model) under various sampling and prompting techniques with the distribution of solutions generated by students, using conversations in which students teach math problems to a conversational robot. Our study reveals discrepancies between the correct solutions produced by LLMs and by students. We discuss the practical implications of these findings for the design and evaluation of LLM-supported educational systems."
}