Building a Universal Dependencies Treebank for a Polysynthetic Language: the Case of Abaza

Alexey Koshevoy, Anastasia Panova, Ilya Makarchuk


Abstract
In this paper, we discuss the challenges that we faced during the construction of a Universal Dependencies treebank for Abaza, a polysynthetic Northwest Caucasian language. We propose an alternative to the morpheme-level annotation of polysynthetic languages introduced by Park et al. (2021). Our approach aims to reduce the number of morphological features while still providing all the information needed for a comprehensive representation of the syntactic relations. In addition, we suggest adding one language-specific relation needed for annotating repetitions in spoken texts, and we present several solutions aimed at increasing the cross-linguistic comparability of our data.
Anthology ID:
2023.udw-1.1
Volume:
Proceedings of the Sixth Workshop on Universal Dependencies (UDW, GURT/SyntaxFest 2023)
Month:
March
Year:
2023
Address:
Washington, D.C.
Editors:
Loïc Grobol, Francis Tyers
Venues:
UDW | SyntaxFest
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
1–6
URL:
https://aclanthology.org/2023.udw-1.1
Cite (ACL):
Alexey Koshevoy, Anastasia Panova, and Ilya Makarchuk. 2023. Building a Universal Dependencies Treebank for a Polysynthetic Language: the Case of Abaza. In Proceedings of the Sixth Workshop on Universal Dependencies (UDW, GURT/SyntaxFest 2023), pages 1–6, Washington, D.C. Association for Computational Linguistics.
Cite (Informal):
Building a Universal Dependencies Treebank for a Polysynthetic Language: the Case of Abaza (Koshevoy et al., UDW-SyntaxFest 2023)
PDF:
https://preview.aclanthology.org/ingest-acl-2023-videos/2023.udw-1.1.pdf
Video:
https://preview.aclanthology.org/ingest-acl-2023-videos/2023.udw-1.1.mov