Abstract
This paper introduces the Universal Dependencies Treebank for Slovenian. We overview the existing dependency treebanks for Slovenian and then detail the conversion of the ssj200k treebank to the framework of Universal Dependencies version 2. We explain the mapping of part-of-speech categories, morphosyntactic features, and the dependency relations, focusing on the more problematic language-specific issues. We conclude with a quantitative overview of the treebank and directions for further work.
- Anthology ID:
- W17-1406
- Volume:
- Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Tomaž Erjavec, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
- Venue:
- BSNLP
- SIG:
- SIGSLAV
- Publisher:
- Association for Computational Linguistics
- Pages:
- 33–38
- URL:
- https://aclanthology.org/W17-1406
- DOI:
- 10.18653/v1/W17-1406
- Cite (ACL):
- Kaja Dobrovoljc, Tomaž Erjavec, and Simon Krek. 2017. The Universal Dependencies Treebank for Slovenian. In Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, pages 33–38, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- The Universal Dependencies Treebank for Slovenian (Dobrovoljc et al., BSNLP 2017)
- PDF:
- https://preview.aclanthology.org/ingest-bitext-workshop/W17-1406.pdf
- Data
- MULTEXT-East, Universal Dependencies