Evelyn Johnson


2025

This work-in-progress study compares the accuracy of machine learning models and large language models in predicting student responses to field-test items on a social-emotional learning assessment. We evaluate how well each method replicates actual responses and compare the item parameters estimated from synthetic data with those derived from actual student data.
This study explores the use of large language models to simulate human responses to Likert-scale items. A DeBERTa-base model fine-tuned on item text and examinee ability emulates a graded response model (GRM). High alignment with GRM probabilities and reasonable threshold recovery support LLMs as scalable tools for early-stage item evaluation.
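For context, the GRM referenced here is Samejima's graded response model, which expresses the probability of each ordered response category as the difference between adjacent cumulative logistic curves. The minimal sketch below (the function name and parameter values are illustrative, not taken from the study) shows how those category probabilities are computed for a single Likert item, given a discrimination parameter and ordered thresholds.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Samejima's graded response model: P(X = k | theta) for K ordered
    categories, given discrimination a and K-1 ordered thresholds b."""
    # Cumulative probabilities P(X >= k | theta) for k = 1..K-1
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b))))
    # Bound with P(X >= 0) = 1 and P(X >= K) = 0, then take differences
    bounds = np.concatenate(([1.0], p_star, [0.0]))
    return bounds[:-1] - bounds[1:]

# Illustrative parameters (not values from the study):
# a 5-point Likert item with discrimination 1.2 and four thresholds
print(grm_category_probs(theta=0.5, a=1.2, b=[-1.5, -0.5, 0.5, 1.5]))
```

Under this model, "alignment with GRM probabilities" means the fine-tuned model's predicted category distribution tracks these curves, and "threshold recovery" means the b parameters re-estimated from synthetic responses approximate those estimated from real ones.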