Subjective Text Complexity Assessment for German

Laura Seiffe, Fares Kallel, Sebastian Möller, Babak Naderi, Roland Roller


Abstract
For various reasons, text can be difficult to read and understand for many people, especially if its language is too complex. In order to provide suitable text for a target audience, it is necessary to measure its complexity. In this paper we describe subjective experiments to assess the readability of German text. We compile a new corpus of sentences provided by a German IT service provider. The sentences are annotated with subjective complexity ratings by two groups of participants, namely experts and non-experts for that text domain. We then extract an extensive set of linguistically motivated features that are assumed to interact with complexity perception. We show that a linear regression model using a subset of these features can be a very good predictor of text complexity.
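
To illustrate the modelling step described in the abstract, the following is a minimal sketch (not the authors' code) of fitting a linear regression over a few hypothetical surface features, such as sentence length, mean word length, and type-token ratio, to predict mean subjective complexity ratings. The feature values and ratings shown are invented placeholders; the paper's actual feature set is considerably larger.

```python
# Illustrative sketch only: linear regression from sentence-level
# linguistic features to subjective complexity ratings.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical feature matrix: one row per sentence, with columns
# [token count, mean word length, type-token ratio] as stand-ins for
# the paper's linguistically motivated features.
X = np.array([
    [12, 5.1, 0.92],
    [34, 6.8, 0.71],
    [ 8, 4.3, 1.00],
    [27, 6.0, 0.78],
])

# Hypothetical mean complexity ratings collected from participants.
y = np.array([1.8, 4.6, 1.2, 3.9])

model = LinearRegression().fit(X, y)
print("Coefficients:", model.coef_)
print("R^2 on training data:", model.score(X, y))
```

In practice one would evaluate such a model with held-out data or cross-validation rather than the training-set fit shown here.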
Anthology ID:
2022.lrec-1.74
Volume:
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
707–714
URL:
https://aclanthology.org/2022.lrec-1.74
Cite (ACL):
Laura Seiffe, Fares Kallel, Sebastian Möller, Babak Naderi, and Roland Roller. 2022. Subjective Text Complexity Assessment for German. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 707–714, Marseille, France. European Language Resources Association.
Cite (Informal):
Subjective Text Complexity Assessment for German (Seiffe et al., LREC 2022)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2022.lrec-1.74.pdf