Universal Dependencies: Extensions for Modern and Historical German

Stefanie Dipper, Cora Haiber, Anna Maria Schröter, Alexandra Wiemann, Maike Brinkschulte


Abstract
In this paper we present extensions of the UD scheme for modern and historical German. The extensions relate in part to fundamental differences such as those between different kinds of arguments and modifiers. We illustrate the extensions with examples from the MHG data and discuss a number of MHG-specific constructions. At the current time, we have annotated a corpus of Middle High German with almost 29K tokens using this scheme, which to our knowledge is the first UD treebank for Middle High German. Inter-annotator agreement is very high: the annotators achieve a score of α = 0.85. A statistical analysis of the annotations shows some interesting differences in the distribution of labels between modern and historical German.
Anthology ID:
2024.lrec-main.1485
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
SIG:
Publisher:
ELRA and ICCL
Note:
Pages:
17101–17111
Language:
URL:
https://aclanthology.org/2024.lrec-main.1485
DOI:
Bibkey:
Cite (ACL):
Stefanie Dipper, Cora Haiber, Anna Maria Schröter, Alexandra Wiemann, and Maike Brinkschulte. 2024. Universal Dependencies: Extensions for Modern and Historical German. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 17101–17111, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Universal Dependencies: Extensions for Modern and Historical German (Dipper et al., LREC-COLING 2024)
Copy Citation:
PDF:
https://preview.aclanthology.org/add_acl24_videos/2024.lrec-main.1485.pdf