Ria Talsania
2026
Faithfulness Beyond Plausibility: Auditing Human Explanations in Educational Assessment
Ria Talsania | Dhruv Ritesh Shah | Sudhir Dhage
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
When rubric-based feedback tools explain a grade, students and instructors assume those explanations reflect how the score was actually determined. Yet it remains unclear whether explanation components such as rubric assignments and evidence spans reflect how scores are constructed or primarily serve as post-hoc justifications. This gap has direct implications for automated essay scoring and rubric-based feedback tools, where explanation reliability is often assumed but rarely evaluated. We introduce a knowledge graph framework that represents human tutor grading traces as structured objects, enabling controlled counterfactual testing of explanation components. Using 400 grading traces from 10 expert human tutors evaluating 100 narrative essays, we define a reconstruction-based diagnostic to measure how explanation components contribute to score interpretation, independent of prediction. Our results reveal a consistent asymmetry: removing rubric-level information leads to substantial changes in reconstructed scores, while removing evidence spans has minimal impact. This suggests that rubric structure is central to score interpretation, whereas cited evidence spans may function primarily as post-hoc justifications. We further observe tutor-specific variation in grading behavior. These findings highlight the need for explanation mechanisms that better align with scoring processes, ensuring that feedback provided to learners is both interpretable and functionally relevant.
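The counterfactual test described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the trace schema, the ablation procedure, or the reconstruction function, so the names `GradingTrace`, `ablate`, `ablation_effect`, and `toy_reconstruct` are hypothetical, and the toy reconstructor is an arbitrary stand-in rather than the authors' diagnostic. The sketch only shows the mechanics: remove one explanation component from a trace, reconstruct the score, and measure how far it moves.

```python
from dataclasses import dataclass, replace
from statistics import mean
from typing import Callable, Dict, List


@dataclass(frozen=True)
class GradingTrace:
    """Hypothetical structured form of one tutor's grading of one essay."""
    tutor_id: str
    essay_id: str
    rubric_levels: Dict[str, int]          # criterion -> level the tutor assigned
    evidence_spans: Dict[str, List[str]]   # criterion -> essay spans the tutor cited


def ablate(trace: GradingTrace, component: str) -> GradingTrace:
    """Return a counterfactual copy of the trace with one component removed."""
    if component == "rubric":
        return replace(trace, rubric_levels={})
    if component == "evidence":
        return replace(trace, evidence_spans={})
    raise ValueError(f"unknown component: {component!r}")


def ablation_effect(
    traces: List[GradingTrace],
    component: str,
    reconstruct: Callable[[GradingTrace], float],
) -> float:
    """Mean absolute change in reconstructed score after removing a component."""
    return mean(
        abs(reconstruct(t) - reconstruct(ablate(t, component))) for t in traces
    )


def toy_reconstruct(trace: GradingTrace) -> float:
    """Arbitrary stand-in reconstructor (an assumption, not the paper's method):
    sum of rubric levels plus 0.5 per criterion that has cited evidence."""
    rubric_part = sum(trace.rubric_levels.values())
    evidence_part = 0.5 * sum(1 for spans in trace.evidence_spans.values() if spans)
    return float(rubric_part + evidence_part)
```

Passing the reconstructor as a parameter keeps the ablation logic separate from whatever scoring model is under audit. Any asymmetry produced by `toy_reconstruct` comes from its hand-picked weights, so it demonstrates the procedure only and says nothing about the reported results.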