Susan Lottridge

2025

Leveraging Fine-tuned Large Language Models in Item Parameter Prediction
Suhwa Han | Frank Rijmen | Allison Ames Boykin | Susan Lottridge
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers

The study introduces novel approaches for fine-tuning pre-trained LLMs to predict item response theory parameters directly from item texts and structured item attribute variables. The proposed methods were evaluated on a dataset of over 1,000 English Language Arts items that are currently in the operational pool for a large-scale assessment.

Examining decoding items using engine transcriptions and scoring in early literacy assessment
Zachary Schultz | Mackenzie Young | Debbie Dugdale | Susan Lottridge
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Works in Progress

We investigate the reliability of two scoring approaches to early literacy decoding items, in which students are shown a word and asked to say it aloud. The approaches were rubric scoring of speech and scoring of human or AI transcriptions under varying explicit scoring rules. Initial results suggest rubric-based approaches perform better than transcription-based methods.

The Impact of an NLP-Based Writing Tool on Student Writing
Karthik Sairam | Amy Burkhardt | Susan Lottridge
Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Coordinated Session Papers

We present preliminary evidence on the impact of an NLP-based writing feedback tool, Write-On with Cambi!, on students’ argumentative writing. Students were randomly assigned to receive access to the tool or not, and their essay scores were compared across three rubric dimensions; estimated effect sizes (Cohen’s d) ranged from 0.25 to 0.26 (with notable variation in the average treatment effect across classrooms). To characterize and compare the groups’ writing processes, we implemented an algorithm that classified each revision as Appended (new text added to the end), Surface-level (minor within-text corrections to conventions), or Substantive (larger within-text changes or additions). We interpret within-text edits (Surface-level or Substantive) as potential markers of metacognitive engagement in revision, and note that these within-text edits are more common among students who had access to the tool. Together, these pilot analyses serve as a first step in testing the tool’s theory of action.

SMART: Simulated Students Aligned with Item Response Theory for Question Difficulty Prediction
Alexander Scarlatos | Nigel Fernandez | Christopher Ormerod | Susan Lottridge | Andrew Lan
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Item (question) difficulties play a crucial role in educational assessments, enabling accurate and efficient assessment of student abilities and personalization to maximize learning outcomes. Traditionally, estimating item difficulties can be costly, requiring real students to respond to items, followed by fitting an item response theory (IRT) model to get difficulty estimates. This approach also cannot be applied to the cold-start setting of previously unseen items. In this work, we present SMART (Simulated Students Aligned with IRT), a novel method for aligning simulated students with instructed ability, which can then be used in simulations to predict the difficulty of open-ended items. We achieve this alignment using direct preference optimization (DPO), where we form preference pairs based on how likely responses are under a ground-truth IRT model. We perform a simulation by generating thousands of responses, evaluating them with a large language model (LLM)-based scoring model, and fitting an IRT model to the resulting data to obtain item difficulty estimates. Through extensive experiments on two real-world student response datasets, we show that SMART outperforms other item difficulty prediction methods by leveraging its improved ability alignment.

Co-authors

Andrew Lan 1

Christopher Ormerod 1

Frank Rijmen 1

Karthik Sairam 1

Alexander Scarlatos 1

Zachary Schultz 1

Mackenzie Young 1

Venues

aimecon 3
emnlp 1
