Predicting Item Difficulty and Generating Reading Comprehension Items via an Annotated Repository

Radhika Kapoor; Mayank Sharma; Sang Truong; Nick Haber; Ben Domingue; Maria Ruiz-Primo

Abstract

Prediction of item difficulty from its text content is of substantial interest for automated generation of test items. In this paper, we focus on the related problem of recovering IRT-based difficulty when the data originally reported item p-values (percentage of correct responses). We model this item difficulty using a repository of reading passages and student data from US standardized tests from New York and Texas for grades 3-8 spanning the years 2018-23. This repository is annotated with metadata on (1) linguistic features of the reading items, (2) test features of the passage, and (3) context features. Using a penalized regression model, we achieve an RMSE of 0.59 (compared to a 0.92 baseline) and a 0.77 correlation between true and predicted difficulty. We further evaluate the impact of LLM embeddings (ModernBERT, BERT, and LLaMA), finding that they improve performance only marginally but function effectively as standalone alternatives to traditional linguistic features. Finally, we demonstrate how this difficulty prediction model powers a publicly available, human-in-the-loop tool for generating reading comprehension items.
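As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes an RMSE, a baseline RMSE, and a Pearson correlation between true and predicted difficulty. It is not the authors' code: the function names and toy numbers are hypothetical, and the baseline is assumed here to be "always predict the mean training difficulty", which the abstract does not specify.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_baseline_rmse(y_train, y_test):
    """RMSE of a model that always predicts the mean training difficulty.

    Assumption: the abstract does not say how its 0.92 baseline is defined;
    a constant mean predictor is only one common choice.
    """
    mean_difficulty = sum(y_train) / len(y_train)
    return rmse(y_test, [mean_difficulty] * len(y_test))


# Toy values for illustration only (not data from the paper).
train_difficulty = [-1.0, 0.0, 1.0]
true_difficulty = [-1.0, 0.0, 1.0, 2.0]
predicted_difficulty = [-0.5, 0.0, 0.5, 1.0]

model_rmse = rmse(true_difficulty, predicted_difficulty)
baseline_rmse = mean_baseline_rmse(train_difficulty, true_difficulty)
correlation = pearson(true_difficulty, predicted_difficulty)
print(f"model RMSE={model_rmse:.3f}, baseline RMSE={baseline_rmse:.3f}, r={correlation:.3f}")
```

A model is informative on this scale only to the extent that its RMSE falls below the baseline RMSE; the correlation separately indicates how well the predictions preserve the ordering and spacing of item difficulties.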

Anthology ID: 2026.bea-1.53
Volume: Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
Month: July
Year: 2026
Address: San Diego, California, USA
Editors: Ekaterina Kochmar, Bashar Alhafni, Stefano Bannò, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anais Tack, Victoria Yaneva, Zheng Yuan
Venues: BEA | WS
Publisher: Association for Computational Linguistics
Pages: 777–797
URL: https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.53/
Cite (ACL): Radhika Kapoor, Mayank Sharma, Sang Truong, Nick Haber, Ben Domingue, and Maria Ruiz-Primo. 2026. Predicting Item Difficulty and Generating Reading Comprehension Items via an Annotated Repository. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), pages 777–797, San Diego, California, USA. Association for Computational Linguistics.
Cite (Informal): Predicting Item Difficulty and Generating Reading Comprehension Items via an Annotated Repository (Kapoor et al., BEA 2026)
PDF: https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.53.pdf
