Rigardt Pretorius


2024

The First Universal Dependency Treebank for Tswana: Tswana-Popapolelo
Tanja Gaustad | Ansu Berg | Rigardt Pretorius | Roald Eiselen
Proceedings of the Fifth Workshop on Resources for African Indigenous Languages @ LREC-COLING 2024

This paper presents the first publicly available UD treebank for Tswana, Tswana-Popapolelo. The data consists of the 20 Cairo CICLing sentences translated into Tswana. After pre-processing these sentences with detailed POS tags (XPOS) and converting them to universal POS tags (UPOS), we annotated the data with dependency relations, documenting decisions for the language-specific constructions. Linguistic issues encountered are described in detail, as this is the first application of the UD framework to produce a dependency treebank for the Bantu language family in general and for Tswana specifically.

2009

Setswana Tokenisation and Computational Verb Morphology: Facing the Challenge of a Disjunctive Orthography
Rigardt Pretorius | Ansu Berg | Laurette Pretorius | Biffie Viljoen
Proceedings of the First Workshop on Language Technologies for African Languages