Evaluating Sampling Strategies for Similarity-Based Short Answer Scoring: a Case Study in Thailand
Pachara Boonsarngsuk, Pacharapon Arpanantikul, Supakorn Hiranwipas, Wipu Watcharakajorn, Ekapol Chuangsuwanich
Abstract
Automatic short answer scoring aims to help grade written work by learners in a given subject. In niche subject domains with few examples, existing methods primarily use similarity-based scoring, relying on predefined reference answers to grade each student’s answer by its similarity to the reference. However, these reference answers are often generated from a randomly selected set of graded student answers, which may fail to represent the full range of scoring variations. We propose a semi-automatic scoring framework that enhances the selective sampling strategy for defining the reference answers through a K-center-based and a K-means-based sampling method. Our results demonstrate that our framework outperforms previous similarity-based scoring methods on a dataset with Thai and English. Moreover, it achieves competitive performance compared to human reference performance and LLMs.
- Anthology ID:
- 2025.sealp-1.3
- Volume:
- Proceedings of the Second Workshop in South East Asian Language Processing
- Month:
- January
- Year:
- 2025
- Address:
- Online
- Editors:
- Derry Wijaya, Alham Fikri Aji, Clara Vania, Genta Indra Winata, Ayu Purwarianti
- Venues:
- sealp | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 27–41
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2025.sealp-1.3/
- Cite (ACL):
- Pachara Boonsarngsuk, Pacharapon Arpanantikul, Supakorn Hiranwipas, Wipu Watcharakajorn, and Ekapol Chuangsuwanich. 2025. Evaluating Sampling Strategies for Similarity-Based Short Answer Scoring: a Case Study in Thailand. In Proceedings of the Second Workshop in South East Asian Language Processing, pages 27–41, Online. Association for Computational Linguistics.
- Cite (Informal):
- Evaluating Sampling Strategies for Similarity-Based Short Answer Scoring: a Case Study in Thailand (Boonsarngsuk et al., sealp 2025)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2025.sealp-1.3.pdf
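
The abstract mentions K-center-based sampling for choosing which student answers to grade as references. It gives no implementation details, so the following is only a minimal sketch of the standard greedy K-center heuristic (farthest-first traversal), not the authors' method. It assumes that each answer has already been embedded as a numeric vector. The function name `k_center_greedy`, the `first` parameter, and the toy vectors in `answers_embedded` are hypothetical and chosen for illustration.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_center_greedy(points: Sequence[Sequence[float]], k: int, first: int = 0) -> List[int]:
    """Pick k indices so that every point is close to some picked point.

    Greedy farthest-first traversal: start from `first` (an assumed seed),
    then repeatedly add the point farthest from the current selection.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    selected = [first]
    # Distance from each point to its nearest selected point so far.
    min_dist = [euclidean(p, points[first]) for p in points]

    while len(selected) < k:
        # The next center is the point worst covered by the current selection.
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, p in enumerate(points):
            d = euclidean(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy, made-up 2-D "embeddings" of five student answers (illustration only).
answers_embedded = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]

# Indices of the answers one would send to a human grader as references.
reference_ids = k_center_greedy(answers_embedded, k=3)
print(reference_ids)
```

On this toy input the selection spreads across the three visibly separate groups of points instead of picking near-duplicates, which is the intuition behind replacing random sampling with a coverage-oriented strategy.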