@inproceedings{burleigh-etal-2025-beyond,
    title = "Beyond the Hint: Using Self-Critique to Constrain {LLM} Feedback in Conversation-Based Assessment",
    author = "Burleigh, Tyler  and
      Han, Jenny  and
      Dicerbo, Kristen",
    editor = "Wilson, Joshua  and
      Ormerod, Christopher  and
      Beiting Parrish, Magdalen",
    booktitle = "Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con): Coordinated Session Papers",
    month = oct,
    year = "2025",
    address = "Wyndham Grand Pittsburgh, Downtown, Pittsburgh, Pennsylvania, United States",
    publisher = "National Council on Measurement in Education (NCME)",
    url = "https://preview.aclanthology.org/ingest-emnlp/2025.aimecon-sessions.9/",
    pages = "79--85",
    ISBN = "979-8-218-84230-7",
    abstract = "Large Language Models in Conversation-Based Assessment tend to provide inappropriate hints that compromise validity. We demonstrate that self-critique {--} a simple prompt engineering technique {--} effectively constrains this behavior. Through two studies using synthetic conversations and real-world high school math pilot data, self-critique reduced inappropriate hints by 90.7{\%} and 24{--}75{\%} respectively. Human experts validated ground truth labels while LLM judges enabled scale. This immediately deployable solution addresses the critical tension in intermediate-stakes assessment: maintaining student engagement while ensuring fair comparisons. Our findings show prompt engineering can meaningfully safeguard assessment integrity without model fine-tuning."
}