Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments

Abhirup Chakravarty, Mark Brenchley, Trevor Breakspear, Ian Lewin, Yan Huang


Abstract
A key ethical challenge in Automated Essay Scoring (AES) is ensuring that scores are released only when they meet high reliability standards. Confidence modelling addresses this by assigning a reliability estimate, in the form of a confidence score, to each automated score. In this study, we frame confidence estimation as a classification task: predicting whether an AES-generated score correctly places a candidate in the appropriate CEFR level. While this is a binary decision, we leverage the inherent granularity of the scoring domain in two ways. First, we reformulate the task as an n-ary classification problem using score binning. Second, we introduce a set of novel Kernel Weighted Ordinal Categorical Cross Entropy (KWOCCE) loss functions that incorporate the ordinal structure of CEFR labels. Our best-performing model achieves an F1 score of 0.97 and enables the system to release 47% of scores with 100% CEFR agreement and 99% of scores with at least 95% CEFR agreement, compared with approximately 92% CEFR agreement from the standalone AES model, where all AM-predicted scores are released.
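The release figures in the abstract pair a coverage (the share of scores released) with a CEFR agreement rate among the released scores. The sketch below shows one way such a pair could be computed for a given confidence threshold. It is only an illustration under stated assumptions: the function name `release_stats`, the thresholding rule (release when confidence is at or above the threshold), and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def release_stats(
    confidences: Sequence[float],
    agrees: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, agreement) for scores released at a threshold.

    confidences: one confidence score per automated score (assumed in [0, 1]).
    agrees: True where the automated score falls in the correct CEFR level.
    threshold: hypothetical release rule; a score is released when its
        confidence is greater than or equal to this value.
    """
    if len(confidences) != len(agrees):
        raise ValueError("confidences and agrees must have the same length")
    if len(confidences) == 0:
        return 0.0, 0.0

    # Keep the agreement flags of the scores that pass the threshold.
    released = [a for c, a in zip(confidences, agrees) if c >= threshold]

    # Coverage: fraction of all scores that are released.
    coverage = len(released) / len(confidences)
    # Agreement: fraction of released scores in the correct CEFR level.
    agreement = sum(released) / len(released) if released else 0.0
    return coverage, agreement


# Toy data, made up for illustration only.
toy_confidences = [0.9, 0.8, 0.4, 0.95]
toy_agrees = [True, False, True, True]

strict = release_stats(toy_confidences, toy_agrees, threshold=0.85)
release_all = release_stats(toy_confidences, toy_agrees, threshold=0.0)
```

On this toy data, the stricter threshold releases half of the scores with full agreement, while releasing everything gives 75% agreement. Sweeping the threshold would trace out the same kind of coverage versus agreement trade-off that the abstract reports.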
Anthology ID:
2025.acl-industry.106
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Georg Rehm, Yunyao Li
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
1498–1507
URL:
https://preview.aclanthology.org/display_plenaries/2025.acl-industry.106/
Cite (ACL):
Abhirup Chakravarty, Mark Brenchley, Trevor Breakspear, Ian Lewin, and Yan Huang. 2025. Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track), pages 1498–1507, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Enhancing Marker Scoring Accuracy through Ordinal Confidence Modelling in Educational Assessments (Chakravarty et al., ACL 2025)
PDF:
https://preview.aclanthology.org/display_plenaries/2025.acl-industry.106.pdf