Chad Marchong

2025

Comparing AI tools and Human Raters in Predicting Reading Item Difficulty
Hongli Li | Roula Aldib | Chad Marchong | Kevin Fan
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Works in Progress

This study compares AI tools and human raters in predicting the difficulty of reading comprehension items without response data. Predictions from AI models (ChatGPT, Gemini, Claude, and DeepSeek) and human raters are evaluated against empirical difficulty values derived from student responses. The findings will inform understanding of AI’s potential to support test development.

Co-authors

Roula Aldib 1
Kevin Fan 1
Hongli Li 1

Venues

aimecon 1
