Abstract
We present our work on constructing the first treebank for the Xibe language following the Universal Dependencies (UD) annotation scheme. Xibe is a low-resourced and severely endangered Tungusic language spoken by the Xibe minority living in the Xinjiang Uygur Autonomous Region of China. We have collected 810 sentences so far, including 544 sentences from a grammar book on written Xibe and 266 sentences from Cabcal News. We annotated those sentences manually from scratch. In this paper, we report the procedure of building this treebank and analyze several important annotation issues of our treebank. Finally, we propose our plans for future work.
- Anthology ID:
- 2020.udw-1.23
- Volume:
- Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020)
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Venue:
- UDW
- Publisher:
- Association for Computational Linguistics
- Pages:
- 205–215
- URL:
- https://aclanthology.org/2020.udw-1.23
- Cite (ACL):
- He Zhou, Juyeon Chung, Sandra Kübler, and Francis Tyers. 2020. Universal Dependency Treebank for Xibe. In Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020), pages 205–215, Barcelona, Spain (Online). Association for Computational Linguistics.
- Cite (Informal):
- Universal Dependency Treebank for Xibe (Zhou et al., UDW 2020)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/2020.udw-1.23.pdf
- Data
- Universal Dependencies