From Entropy to Generalizability: Strengthening Automated Essay Scoring Reliability and Sustainability

Yi Gui

Abstract

Generalizability Theory with entropy-derived stratification optimized automated essay scoring reliability. A G-study decomposed variance across 14 encoders and 3 seeds; D-studies identified minimal ensembles achieving G ≥ 0.85. A hybrid of one medium and one small encoder with two seeds maximized dependability per compute cost. Stratification ensured uniform precision across

Anthology ID: 2025.aimecon-main.34
Volume: Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers
Month: October
Year: 2025
Address: Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States
Editors: Joshua Wilson, Christopher Ormerod, Magdalen Beiting Parrish
Venue: AIME-Con
Publisher: National Council on Measurement in Education (NCME)
Pages: 312–328
URL: https://preview.aclanthology.org/more-markup/2025.aimecon-main.34/
Cite (ACL): Yi Gui. 2025. From Entropy to Generalizability: Strengthening Automated Essay Scoring Reliability and Sustainability. In Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Full Papers, pages 312–328, Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States. National Council on Measurement in Education (NCME).
Cite (Informal): From Entropy to Generalizability: Strengthening Automated Essay Scoring Reliability and Sustainability (Gui, AIME-Con 2025)
PDF: https://preview.aclanthology.org/more-markup/2025.aimecon-main.34.pdf
