Job Sylvanus


2025

Universal Dependencies Treebank for Khoekhoe (KDT)
Tulchynska Kira | Job Sylvanus | Witzlack-Makarevich Alena
Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025)

This paper reports on the development of the first dependency treebank for Khoekhoe (KDT). Khoekhoe (Khoe-Kwadi, Namibia) is a low-resource language with few publicly available linguistic and computational resources. The treebank consists of 29k words across six texts drawn from various registers and includes a substantial portion of spoken conversational data. The sentences were annotated manually according to the Universal Dependencies framework. In this paper, apart from presenting the strategies followed to create the treebank, we also discuss some challenging morphological features and syntactic constructions found in the corpus and outline how we have handled them using the current Universal Dependencies specification.