Universal Dependency Treebank for Xibe

He Zhou, Juyeon Chung, Sandra Kübler, Francis Tyers


Abstract
We present our work on constructing the first treebank for the Xibe language following the Universal Dependencies (UD) annotation scheme. Xibe is a low-resource and severely endangered Tungusic language spoken by the Xibe minority living in the Xinjiang Uygur Autonomous Region of China. We have collected 810 sentences so far, including 544 sentences from a grammar book on written Xibe and 266 sentences from Cabcal News. We annotated these sentences manually from scratch. In this paper, we report the procedure for building this treebank and analyze several important annotation issues in it. Finally, we propose our plans for future work.
Anthology ID:
2020.udw-1.23
Volume:
Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020)
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Marie-Catherine de Marneffe, Miryam de Lhoneux, Joakim Nivre, Sebastian Schuster
Venue:
UDW
Publisher:
Association for Computational Linguistics
Pages:
205–215
URL:
https://aclanthology.org/2020.udw-1.23
Cite (ACL):
He Zhou, Juyeon Chung, Sandra Kübler, and Francis Tyers. 2020. Universal Dependency Treebank for Xibe. In Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020), pages 205–215, Barcelona, Spain (Online). Association for Computational Linguistics.
Cite (Informal):
Universal Dependency Treebank for Xibe (Zhou et al., UDW 2020)
PDF:
https://preview.aclanthology.org/naacl24-info/2020.udw-1.23.pdf
Data
Universal Dependencies