Denise Loefflad


2026

This paper presents an empirical evaluation of the German Grammar Profile (GGP), a CEFR-aligned resource of criterial features, and its corresponding extraction system, PALME. We design a systematic test suite in which each feature extractor is evaluated on controlled positive and negative examples. The results show that PALME achieves high precision and recall across all CEFR levels, with over 90% of features scoring above 0.8. Qualitative analysis shows that lower performance results primarily from morphological ambiguity in noun and adjective case marking. To determine how useful the GGP's criterial features are for CEFR-aligned readability assessment, we measure their predictive power using Explainable Boosting Machines on graded readers. The model achieves strong performance (precision: 0.75, recall: 0.73). Further qualitative analysis shows that features related to specific verb constructions follow patterns consistent with the developmental stages predicted by Processability Theory. These findings underline the value and relevance of criterial features for modeling language development in readability assessment.
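
To make the test-suite idea concrete, the following minimal sketch shows how precision and recall could be computed for a single feature extractor from controlled positive and negative examples. It is an illustration under stated assumptions, not PALME's implementation: the `Example` class, the `precision_recall` helper, the regex-based `toy_weil_extractor`, and the example sentences are all hypothetical and introduced only for this sketch.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Example:
    """One controlled test item for a single criterial feature (hypothetical format)."""
    text: str
    has_feature: bool  # True = positive example, False = negative example


def precision_recall(
    extractor: Callable[[str], bool], examples: Iterable[Example]
) -> Tuple[float, float]:
    """Score one extractor against labeled positive and negative examples."""
    tp = fp = fn = 0
    for ex in examples:
        predicted = extractor(ex.text)
        if predicted and ex.has_feature:
            tp += 1  # feature present and found
        elif predicted and not ex.has_feature:
            fp += 1  # feature absent but reported
        elif not predicted and ex.has_feature:
            fn += 1  # feature present but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def toy_weil_extractor(text: str) -> bool:
    """Hypothetical surface-level extractor for 'weil' subordinate clauses.

    It only checks for the token 'weil', so it cannot tell the conjunction
    apart from a homographic proper noun.
    """
    return re.search(r"\bweil\b", text, flags=re.IGNORECASE) is not None


# Assumed toy suite: two positives and three negatives for the 'weil' clause feature.
SUITE = [
    Example("Ich bleibe zu Hause, weil ich krank bin.", True),
    Example("Weil es regnet, nehmen wir den Bus.", True),
    Example("Ich bleibe zu Hause, denn ich bin krank.", False),
    Example("Sie verweilt im Garten.", False),
    Example("Er wohnt in Weil am Rhein.", False),  # place name, not a conjunction
]

precision, recall = precision_recall(toy_weil_extractor, SUITE)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On this toy suite the extractor finds both positive examples, so recall is 1.0, but it also fires on the place name "Weil am Rhein", so precision drops to 2/3. The negative examples are what expose this kind of surface ambiguity, which is why a suite of this shape pairs each feature with both kinds of items.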