Universal Dependencies for the Alemannic Alsatian Dialects

Barbara Hoff, Nathanaël Beiner, Delphine Bernhard


Abstract
We present the first corpus of Alsatian Alemannic dialects following the guidelines of Universal Dependencies (UD), a project which already covers many of the world’s languages. Standard languages are better represented in UD than non-standard varieties, and our corpus helps close the resource gap for Alsatian dialects, which are spoken in Northeastern France, by providing their first UD treebank. The corpus is annotated with part-of-speech tags and dependency information, as well as French glosses and German lemmas. It contains 975 sentences and 19,286 tokens in total and spans various text genres. In this article, we present our data, details of the annotation process, and some specific syntactic phenomena which differentiate and situate Alsatian with regard to both Standard German and some other non-standard German varieties. Adding this corpus to the UD project gives the Alemannic Alsatian dialects higher visibility in linguistic research and provides a valuable resource for research in many fields, including NLP, syntax and comparative Germanic linguistics.
Anthology ID:
2025.tlt-1.2
Volume:
Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025)
Month:
August
Year:
2025
Address:
Ljubljana, Slovenia
Editors:
Sarah Jablotschkin, Sandra Kübler, Heike Zinsmeister
Venues:
TLT | WS | SyntaxFest
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
10–22
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.tlt-1.2/
Cite (ACL):
Barbara Hoff, Nathanaël Beiner, and Delphine Bernhard. 2025. Universal Dependencies for the Alemannic Alsatian Dialects. In Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025), pages 10–22, Ljubljana, Slovenia. Association for Computational Linguistics.
Cite (Informal):
Universal Dependencies for the Alemannic Alsatian Dialects (Hoff et al., TLT-SyntaxFest 2025)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.tlt-1.2.pdf