Martin Kappus
2026
Evaluating LLM-based Text Simplification for German: Effects on Post-Editing Effort, Quality Ratings, and User Comprehension
Luisa Carrer | Andreas Säuberli | Martin Kappus | Lukas Fischer | Sarah Ebling
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Automatic text simplification (ATS) seeks to automate the process of rewording within the same language to enhance readability and comprehension. Current evaluation practices for ATS systems predominantly rely on automatic metrics or assessments by experts and crowdworkers, often excluding the intended end users and other stakeholders, and thus limiting insights into the actual effectiveness of ATS models. In this study, we address this gap by conducting a multi-faceted, mixed-method evaluation of two LLM-based ATS systems for German (capito.ai and GPT-4o) and by involving end users, post-editors, and Easy Language experts. The findings highlight the effectiveness of the LLM-based ATS systems examined across several dimensions, including post-editing efficiency, expert quality assessments, and, in the case of GPT-4o-generated simplifications, user comprehension. Post-editing effort metrics, in particular, show an increase in productivity of around 30% compared to full manual simplification. Moreover, the results reveal substantial differences in perception and understanding among participant groups. These outcomes clearly indicate that ATS for German has recently made considerable progress and, crucially, underscore the importance of incorporating multiple stakeholders into ATS evaluation to better align system performance with accessibility goals.
2025
Proceedings of the 1st Workshop on Artificial Intelligence and Easy and Plain Language in Institutional Contexts (AI & EL/PL)
María Isabel Rivas Ginel | Patrick Cadwell | Paolo Canavese | Silvia Hansen-Schirra | Martin Kappus | Anna Matamala | Will Noonan
2024
Towards Holistic Human Evaluation of Automatic Text Simplification
Luisa Carrer | Andreas Säuberli | Martin Kappus | Sarah Ebling
Proceedings of the Fourth Workshop on Human Evaluation of NLP Systems (HumEval) @ LREC-COLING 2024
Text simplification refers to the process of rewording within a single language, moving from a standard form into an easy-to-understand one. Easy Language and Plain Language are two examples of simplified varieties aimed at improving readability and understanding for a wide-ranging audience. Human evaluation of automatic text simplification is usually done by employing experts or crowdworkers to rate the generated texts. However, this approach does not include the target readers of simplified texts and does not reflect actual comprehensibility. In this paper, we explore different ways of measuring the quality of automatically simplified texts. We conducted a multi-faceted evaluation study involving end users, post-editors, and Easy Language experts and applied a variety of qualitative and quantitative methods. We found differences in the perception and actual comprehension of the texts by different user groups. In addition, qualitative surveys and behavioral observations proved to be essential in interpreting the results.