Lars Ingver Höft


2024

DARIUS: A Comprehensive Learner Corpus for Argument Mining in German-Language Essays
Nils-Jonathan Schaller | Andrea Horbach | Lars Ingver Höft | Yuning Ding | Jan Luca Bahr | Jennifer Meyer | Thorben Jansen
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

In this paper, we present the DARIUS (Digital Argumentation Instruction for Science) corpus for argumentation quality, comprising 4589 essays written by 1839 German secondary school students. The corpus is annotated according to a fine-grained annotation scheme that ranges from broad features, such as content zones, to more granular ones, such as argumentation coverage/reach and argumentative discourse units like claims and warrants. Inter-annotator agreement for these features reaches up to 0.83 (Krippendorff’s α). The corpus and dataset are publicly available for further research in argument mining.