Predicting Item Difficulty and Generating Reading Comprehension Items via an Annotated Repository
Radhika Kapoor, Mayank Sharma, Sang Truong, Nick Haber, Ben Domingue, Maria Ruiz-Primo
Abstract
Predicting an item's difficulty from its text content is of substantial interest for the automated generation of test items. In this paper, we focus on the related problem of recovering IRT-based difficulty when the data originally reported item p-values (percent of correct responses). We model this item difficulty using a repository of reading passages and student data from US standardized tests from New York and Texas for grades 3-8, spanning the years 2018-23. This repository is annotated with metadata on (1) linguistic features of the reading items, (2) test features of the passage, and (3) context features. Using a penalized regression model, we achieve an RMSE of 0.59 (compared to a 0.92 baseline) and a 0.77 correlation between true and predicted difficulty. We further evaluate the impact of LLM embeddings (ModernBERT, BERT, and LLaMA), finding that they improve performance only marginally but function effectively as standalone alternatives to traditional linguistic features. Finally, we demonstrate how this difficulty prediction model powers a publicly available, human-in-the-loop tool for generating reading comprehension items.
- Anthology ID:
- 2026.bea-1.53
- Volume:
- Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Ekaterina Kochmar, Bashar Alhafni, Stefano Bannò, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anais Tack, Victoria Yaneva, Zheng Yuan
- Venues:
- BEA | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 777–797
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.53/
- Cite (ACL):
- Radhika Kapoor, Mayank Sharma, Sang Truong, Nick Haber, Ben Domingue, and Maria Ruiz-Primo. 2026. Predicting Item Difficulty and Generating Reading Comprehension Items via an Annotated Repository. In Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026), pages 777–797, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- Predicting Item Difficulty and Generating Reading Comprehension Items via an Annotated Repository (Kapoor et al., BEA 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.bea-1.53.pdf