Universal Dependencies for Suansu

Jessica K. Ivani, Kira Tulchynska


Abstract
This contribution presents the Naga-Suansu Universal Dependencies (UD) treebank, the first resource of its kind for Suansu, an endangered and underdocumented Tibeto-Burman language spoken in Northeast India. We describe the corpus composition, data sources, and annotation process, and outline the general structure of the treebank, which follows the UD annotation framework. In addition, we highlight morphosyntactic challenges where Suansu grammar does not fit neatly into the UD annotation schema and propose adaptations to better capture its structural properties. As the treebank of the first Tibeto-Burman language included in the UD project, Naga-Suansu serves several purposes: it contributes to the documentation and preservation of endangered languages, supports the study of cross-linguistic variation, and informs future efforts to refine UD annotation practices for South and Southeast Asian languages.
Anthology ID:
2025.udw-1.4
Volume:
Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025)
Month:
August
Year:
2025
Address:
Ljubljana, Slovenia
Editors:
Gosse Bouma, Çağrı Çöltekin
Venues:
UDW | WS | SyntaxFest
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
30–38
URL:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.udw-1.4/
Cite (ACL):
Jessica K. Ivani and Kira Tulchynska. 2025. Universal Dependencies for Suansu. In Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025), pages 30–38, Ljubljana, Slovenia. Association for Computational Linguistics.
Cite (Informal):
Universal Dependencies for Suansu (Ivani & Tulchynska, UDW-SyntaxFest 2025)
PDF:
https://preview.aclanthology.org/mtsummit-25-ingestion/2025.udw-1.4.pdf