Scoring with Confidence? – Exploring High-confidence Scoring for Saving Manual Grading Effort
Marie Bexte, Andrea Horbach, Lena Schützler, Oliver Christ, Torsten Zesch
Abstract
A possible way to save manual grading effort in short answer scoring is to automatically score answers for which the classifier is highly confident. We explore the feasibility of this approach in a high-stakes exam setting, evaluating three different similarity-based scoring methods, where the similarity score is a direct proxy for model confidence. The decision on an appropriate level of confidence should ideally be made before scoring a new prompt. We thus probe to what extent confidence thresholds are consistent across different datasets and prompts. We find that high-confidence thresholds vary on a prompt-to-prompt basis, and that the overall potential of increased performance at a reasonable cost of additional manual effort is limited.
- Anthology ID:
- 2024.bea-1.11
- Volume:
- Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Ekaterina Kochmar, Marie Bexte, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Anaïs Tack, Victoria Yaneva, Zheng Yuan
- Venue:
- BEA
- SIG:
- SIGEDU
- Publisher:
- Association for Computational Linguistics
- Pages:
- 119–124
- URL:
- https://aclanthology.org/2024.bea-1.11
- Cite (ACL):
- Marie Bexte, Andrea Horbach, Lena Schützler, Oliver Christ, and Torsten Zesch. 2024. Scoring with Confidence? – Exploring High-confidence Scoring for Saving Manual Grading Effort. In Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), pages 119–124, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Scoring with Confidence? – Exploring High-confidence Scoring for Saving Manual Grading Effort (Bexte et al., BEA 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/2024.bea-1.11.pdf
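
The idea described in the abstract, scoring an answer automatically only when a similarity-based confidence is high and otherwise leaving it for manual grading, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the three similarity methods, so a simple token-overlap (Jaccard) similarity stands in for them, and the names `jaccard`, `score_or_defer`, the threshold value, and the example answers are hypothetical.

```python
from typing import List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity in [0, 1]; a stand-in for a real similarity model."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def score_or_defer(
    answer: str,
    references: List[Tuple[str, int]],
    threshold: float,
) -> Tuple[Optional[int], float]:
    """Return (label, confidence), or (None, confidence) to defer to a human grader.

    The label of the most similar scored reference answer is used, and that
    similarity doubles as the confidence (hypothetical setup, not the paper's).
    """
    best_label: Optional[int] = None
    best_sim = 0.0
    for ref_text, ref_label in references:
        sim = jaccard(answer, ref_text)
        if sim > best_sim:
            best_label, best_sim = ref_label, sim
    if best_sim >= threshold:
        return best_label, best_sim  # confident enough: score automatically
    return None, best_sim  # below threshold: leave for manual grading


# Hypothetical scored reference answers: (text, score).
references = [
    ("the heart pumps blood through the body", 1),
    ("the lungs pump blood", 0),
]

auto = score_or_defer("the heart pumps blood through the body", references, 0.8)
deferred = score_or_defer("blood moves because of the heart", references, 0.8)
```

In this sketch the threshold is the only knob trading manual effort against automatic coverage. The abstract's finding that suitable thresholds vary from prompt to prompt suggests that a single fixed value, as used here, would not transfer reliably across prompts.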