Yi Zhan
2026
LongTutor: Benchmarking Large Language Models for Long-term Personalized Tutoring
Ning Li | Zheng Zhang | Zhenya Huang | Rui Li | Yi Zhan | Yinbo Luo | Qi Liu | Enhong Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The rapid advancement of large language models (LLMs) has driven the deployment of LLM-based AI tutors on online learning platforms. This widespread adoption highlights an urgent need for systematic benchmarks to evaluate their tutoring capabilities. However, existing evaluations predominantly focus on isolated, short-term interactions, overlooking the inherently long-term nature of learning. To bridge this gap, we introduce LongTutor, a benchmark for long-term personalized tutoring grounded in formative assessment theory. Built from expert-annotated real-world learning logs, LongTutor evaluates LLMs across three progressive tasks: historical evidence acquisition, knowledge state diagnosis, and adaptive teaching action. Our experiments reveal a critical capability mismatch: while LLMs excel at evidence acquisition, they struggle to effectively leverage long-term history for accurate diagnosis and adaptive teaching. To enable scalable benchmark expansion, we further propose an automated generator–verifier pipeline, paving the way toward truly long-term AI tutoring systems.