Towards the Morphological Annotation of North Markian (Low German)

Christian Chiarcos


Abstract
Low German (Low Saxon, ISO 639-2 nds) is an under-resourced West Germanic language spoken in Northern Germany (Plattdütsch), in the Netherlands (Nedersaksisch), and in an international diaspora (Plautdietsch, Pomerano, etc.). As a minority language, it is under pressure from the respective national languages and is considered threatened. Although NLP and digital language resources might help to facilitate the use of the language on the web and to support intergenerational transmission, no NLP tools are known to exist, nor are there adequate corpora on which such tools could be trained. This paper describes the construction of a novel corpus of North Markian, a dialect of East Low German, together with its morphosyntactic annotation and morphological analysis. In particular, it explores methods to bootstrap and develop such resources in the face of a complete lack of training data.
Anthology ID:
2026.lrec-main.916
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
11699–11714
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.916/
Cite (ACL):
Christian Chiarcos. 2026. Towards the Morphological Annotation of North Markian (Low German). International Conference on Language Resources and Evaluation, main:11699–11714.
Cite (Informal):
Towards the Morphological Annotation of North Markian (Low German) (Chiarcos, LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.916.pdf