Using Machine Learning to Predict Item Difficulty and Response Time in Medical Tests
Mehrdad Yousefpoori-Naeim | Shayan Zargari | Zahra Hatami
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)

Prior knowledge of item characteristics, such as difficulty and response time, without pretesting items can substantially save time and cost in high-standard test development. Using a variety of machine learning (ML) algorithms, the present study explored several (non-)linguistic features (such as Coh-Metrix indices) along with MPNet word embeddings to predict the difficulty and response time of a sample of medical test items. In both prediction tasks, the contribution of embeddings to models already containing other features was found to be extremely limited. Moreover, a comparison of feature importance scores across the two prediction tasks revealed that cohesion-based features were the strongest predictors of difficulty, while the prediction of response time was primarily dependent on length-related features.