QG-SMS: Enhancing Test Item Analysis via Student Modeling and Simulation

Bang Nguyen, Tingting Du, Mengxia Yu, Lawrence Angrave, Meng Jiang


Abstract
While the Question Generation (QG) task has been increasingly adopted in educational assessments, its evaluation remains limited by approaches that lack a clear connection to the educational values of test items. In this work, we introduce test item analysis, a method frequently used by educators to assess test question quality, into QG evaluation. Specifically, we construct pairs of candidate questions that differ in quality across dimensions such as topic coverage, item difficulty, item discrimination, and distractor efficiency. We then examine whether existing QG evaluation approaches can effectively distinguish these differences. Our findings reveal significant shortcomings in these approaches with respect to accurately assessing test item quality in relation to student performance. To address this gap, we propose a novel QG evaluation framework, QG-SMS, which leverages Large Language Models for Student Modeling and Simulation to perform test item analysis. As demonstrated in our extensive experiments and human evaluation study, the additional perspectives introduced by the simulated student profiles lead to a more effective and robust assessment of test items.
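
The abstract names item difficulty, item discrimination, and distractor efficiency but gives no formulas for them. As a rough illustration only, the sketch below computes the textbook classical-test-theory versions of these three statistics from a set of student responses (real or simulated). The function names, the 27% upper/lower grouping, the 5% distractor threshold, and the toy data are all assumptions made for this example; they are not taken from the paper and may differ from how QG-SMS defines or uses these quantities.

```python
from typing import Sequence


def item_difficulty(responses: Sequence[str], answer: str) -> float:
    """Proportion of students who chose the correct option (higher = easier)."""
    return sum(r == answer for r in responses) / len(responses)


def item_discrimination(
    responses: Sequence[str],
    answer: str,
    total_scores: Sequence[float],
    fraction: float = 0.27,  # assumed group size; a common textbook choice
) -> float:
    """Upper-lower index: proportion correct in the top-scoring group
    minus proportion correct in the bottom-scoring group."""
    n = len(responses)
    k = max(1, round(n * fraction))
    # Rank students by their overall test score, lowest first.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower = [responses[i] for i in order[:k]]
    upper = [responses[i] for i in order[-k:]]
    return item_difficulty(upper, answer) - item_difficulty(lower, answer)


def distractor_efficiency(
    responses: Sequence[str],
    answer: str,
    options: Sequence[str],
    threshold: float = 0.05,  # assumed cutoff for a "functional" distractor
) -> float:
    """Share of wrong options picked by at least `threshold` of students."""
    distractors = [o for o in options if o != answer]
    n = len(responses)
    functional = sum(
        1 for d in distractors
        if sum(r == d for r in responses) / n >= threshold
    )
    return functional / len(distractors)


# Hypothetical data: ten students answering one four-option item keyed "A".
responses = ["A", "A", "A", "B", "A", "C", "A", "B", "A", "A"]
total_scores = [9, 8, 7, 3, 8, 2, 6, 4, 9, 5]  # overall test scores
options = ["A", "B", "C", "D"]

difficulty = item_difficulty(responses, "A")
discrimination = item_discrimination(responses, "A", total_scores)
efficiency = distractor_efficiency(responses, "A", options)
```

In this toy data, seven of ten students answer correctly, the three strongest students all pick the key while the three weakest all miss it, and option "D" is never chosen, so only two of the three distractors count as functional.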
Anthology ID:
2025.acl-long.1268
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
26152–26168
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1268/
Cite (ACL):
Bang Nguyen, Tingting Du, Mengxia Yu, Lawrence Angrave, and Meng Jiang. 2025. QG-SMS: Enhancing Test Item Analysis via Student Modeling and Simulation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 26152–26168, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
QG-SMS: Enhancing Test Item Analysis via Student Modeling and Simulation (Nguyen et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1268.pdf