The Universal Dependencies Treebank for Slovenian

Kaja Dobrovoljc, Tomaž Erjavec, Simon Krek


Abstract
This paper introduces the Universal Dependencies Treebank for Slovenian. We overview the existing dependency treebanks for Slovenian and then detail the conversion of the ssj200k treebank to the framework of Universal Dependencies version 2. We explain the mapping of part-of-speech categories, morphosyntactic features, and the dependency relations, focusing on the more problematic language-specific issues. We conclude with a quantitative overview of the treebank and directions for further work.
Anthology ID:
W17-1406
Volume:
Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Tomaž Erjavec, Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
Venue:
BSNLP
SIG:
SIGSLAV
Publisher:
Association for Computational Linguistics
Pages:
33–38
URL:
https://aclanthology.org/W17-1406
DOI:
10.18653/v1/W17-1406
Cite (ACL):
Kaja Dobrovoljc, Tomaž Erjavec, and Simon Krek. 2017. The Universal Dependencies Treebank for Slovenian. In Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing, pages 33–38, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
The Universal Dependencies Treebank for Slovenian (Dobrovoljc et al., BSNLP 2017)
PDF:
https://preview.aclanthology.org/ingest-bitext-workshop/W17-1406.pdf
Data
MULTEXT-East, Universal Dependencies