Evaluating Automated Scoring Models on Official ENEM Essays
Laís Nuto Rossman, Igor Cataneo Silveira, Denis Deratani Mauá
Abstract
Automated Essay Scoring systems can relieve teachers of the laborious task of grading essays and allow students to practice more frequently thanks to faster feedback cycles. In Brazilian Portuguese, there is growing interest in automatic scoring systems for the standardized ENEM exam. However, the only available datasets consist of essays written as practice for the official exam. To the best of our knowledge, no prior work uses mock-exam datasets to evaluate official ENEM essays. This work fills that gap by presenting a new labeled dataset of 157 essays written for the official ENEM exam. The analysis shows that this dataset shares characteristics with existing datasets of mock-exam essays. The results also indicate that, for small datasets such as this one, using LLMs pretrained on mock exams significantly improves the performance of automatic scorers on official ENEM essays, yielding an average gain of 0.27 points in the Quadratic Weighted Kappa metric compared to training solely on official data.
- Anthology ID: 2026.propor-1.16
- Volume: Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1
- Month: April
- Year: 2026
- Address: Salvador, Brazil
- Editors: Marlo Souza, Iria de-Dios-Flores, Diana Santos, Larissa Freitas, Jackson Wilke da Cruz Souza, Eugénio Ribeiro
- Venue: PROPOR
- Publisher: Association for Computational Linguistics
- Pages: 161–171
- URL: https://preview.aclanthology.org/ingest-dnd/2026.propor-1.16/
- Cite (ACL): Laís Nuto Rossman, Igor Cataneo Silveira, and Denis Deratani Mauá. 2026. Evaluating Automated Scoring Models on Official ENEM Essays. In Proceedings of the 17th International Conference on Computational Processing of Portuguese (PROPOR 2026) - Vol. 1, pages 161–171, Salvador, Brazil. Association for Computational Linguistics.
- Cite (Informal): Evaluating Automated Scoring Models on Official ENEM Essays (Rossman et al., PROPOR 2026)
- PDF: https://preview.aclanthology.org/ingest-dnd/2026.propor-1.16.pdf
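
The abstract reports its results in Quadratic Weighted Kappa (QWK), an agreement measure that penalizes disagreements between two raters by the squared distance between their labels. As a rough illustration of how the metric is usually computed, here is a minimal sketch in plain Python. The function name `quadratic_weighted_kappa`, the `num_labels` parameter, and the encoding of scores as integers in `range(num_labels)` are assumptions made for this example, not details taken from the paper.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_labels):
    """Illustrative QWK between two equal-length lists of integer labels.

    Labels are assumed to be integers in range(num_labels); this encoding
    is a choice made for the sketch, not something stated in the paper.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    if num_labels < 2:
        raise ValueError("num_labels must be at least 2")

    n = len(rater_a)

    # Observed confusion matrix: observed[i][j] counts items that
    # rater A labeled i and rater B labeled j.
    observed = [[0] * num_labels for _ in range(num_labels)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal label histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_labels))
              for j in range(num_labels)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_labels):
        for j in range(num_labels):
            # Quadratic penalty, normalized to [0, 1].
            weight = (i - j) ** 2 / (num_labels - 1) ** 2
            # Count expected under independence of the two raters.
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        # Both raters used one and the same label for every item;
        # treated here as perfect agreement by convention.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ratings for four essays on a three-level scale.
human = [0, 1, 2, 2]
model = [0, 1, 1, 2]
print(quadratic_weighted_kappa(human, model, num_labels=3))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement, which gives a sense of scale for the 0.27 average gain the abstract reports.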