Anastasia Drackert


2025

pdf bib
Automated Scoring of a German Written Elicited Imitation Test
Mihail Chifligarov | Jammila Laâguidi | Max Schellenberg | Alexander Dill | Anna Timukova | Anastasia Drackert | Ronja Laarmann-Quante
Proceedings of the 20th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2025)

We present an approach to the automated scoring of a German Written Elicited Imitation Test, designed to assess literacy-dependent procedural knowledge in German as a foreign language. In this test, sentences are briefly displayed on a screen and, after a short pause, test-takers are asked to reproduce the sentence in writing as accurately as possible. Responses are rated on a 5-point ordinal scale, with grammatical errors typically penalized more heavily than lexical deviations. We compare a rule-based model that implements the categories of the scoring rubric through hand-crafted rules, and a deep learning model trained on pairs of stimulus sentences and written responses. Both models achieve promising performance with quadratically weighted kappa (QWK) values around .87. However, their strengths differ – the rule-based model performs better on previously unseen stimulus sentences and at the extremes of the rating scale, while the deep learning model shows advantages in scoring mid-range responses, for which explicit rules are harder to define.