Beyond the Gold Standard in Analytic Automated Essay Scoring

Gabrielle Gaudeau


Abstract
Automated Essay Scoring (AES) was originally developed to reduce the manual burden of grading standardised language tests, and research in the field has long focused on holistic scoring methods, which offer minimal formative feedback in the classroom. With the increasing demand for technological tools that support language acquisition, the field is turning to analytic AES (evaluating essays according to different linguistic traits). This approach holds promise for generating more detailed essay feedback, but relies on analytic scoring data that is both more cognitively demanding for humans to produce and prone to bias. The dominant paradigm in AES is to aggregate disagreements between raters into a single gold-standard label, which fails to account for genuine examiner variability. In an attempt to make AES more representative and trustworthy, we propose to explore the sources of disagreement and lay out a novel AES system design that learns from individual raters instead of the gold-standard labels.
Anthology ID:
2025.acl-srw.2
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Jin Zhao, Mingyang Wang, Zhu Liu
Venues:
ACL | WS
Publisher:
Association for Computational Linguistics
Pages:
18–39
URL:
https://preview.aclanthology.org/landing_page/2025.acl-srw.2/
Cite (ACL):
Gabrielle Gaudeau. 2025. Beyond the Gold Standard in Analytic Automated Essay Scoring. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 18–39, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Beyond the Gold Standard in Analytic Automated Essay Scoring (Gaudeau, ACL 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.acl-srw.2.pdf