DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing

Haneul Yoo; Jieun Han; So-Yeon Ahn; Alice Oh

doi:10.18653/v1/2025.acl-long.659

Abstract

Automated essay scoring (AES) is a useful tool in English as a Foreign Language (EFL) writing education, offering real-time essay scores for students and instructors. However, previous AES models were trained on essays and scores irrelevant to the practical scenarios of EFL writing education and usually provided a single holistic score due to the lack of appropriate datasets. In this paper, we release DREsS, a large-scale, standard dataset for rubric-based automated essay scoring with 48.9K samples in total. DREsS comprises three sub-datasets: DREsS_New, DREsS_Std., and DREsS_CASE. We collect DREsS_New, a real-classroom dataset with 2.3K essays authored by EFL undergraduate students and scored by English education experts. We also standardize existing rubric-based essay scoring datasets as DREsS_Std. We suggest CASE, a corruption-based augmentation strategy for essays, which generates 40.1K synthetic samples of DREsS_CASE and improves the baseline results by 45.44%. DREsS will enable further research to provide a more accurate and practical AES system for EFL writing education.

Anthology ID: 2025.acl-long.659
Volume: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 13439–13454
URL: https://preview.aclanthology.org/ingest-emnlp/2025.acl-long.659/
DOI: 10.18653/v1/2025.acl-long.659
Cite (ACL): Haneul Yoo, Jieun Han, So-Yeon Ahn, and Alice Oh. 2025. DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 13439–13454, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing (Yoo et al., ACL 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.acl-long.659.pdf
