Universal Dependency Treebank for a low-resource Dardic Language: Torwali

Naeem Uddin, Daniel Zeman


Abstract
This paper presents and discusses the linguistic phenomena encountered while developing the first Universal Dependencies treebank for the Torwali language, a project that is still ongoing. Torwali belongs to the Kohistani sub-group of the Dardic Indo-Aryan languages and is considered both endangered (Moseley, 2010) and indigenous, which leaves it extremely low-resourced in terms of linguistic and computational resources. With the aim of including Torwali in Universal Dependencies (UD) (de Marneffe et al., 2021), we are annotating a diverse set of example sentences with POS tags, morphological features and dependency relations.
Anthology ID:
2025.tlt-1.16
Volume:
Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025)
Month:
August
Year:
2025
Address:
Ljubljana, Slovenia
Editors:
Sarah Jablotschkin, Sandra Kübler, Heike Zinsmeister
Venues:
TLT | WS | SyntaxFest
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
140–147
URL:
https://preview.aclanthology.org/corrections-2025-08/2025.tlt-1.16/
Cite (ACL):
Naeem Uddin and Daniel Zeman. 2025. Universal Dependency Treebank for a low-resource Dardic Language: Torwali. In Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025), pages 140–147, Ljubljana, Slovenia. Association for Computational Linguistics.
Cite (Informal):
Universal Dependency Treebank for a low-resource Dardic Language: Torwali (Uddin & Zeman, TLT-SyntaxFest 2025)
PDF:
https://preview.aclanthology.org/corrections-2025-08/2025.tlt-1.16.pdf