A New Hebrew Universal Dependency Treebank: The First Treebank of Post-Rabbinic Historical Hebrew

Rachel Tal, Shlomit Fuchs, Orly Albeck, Elisheva Brauner, Yitzchak Lindenbaum, Ephraim Meiri, Avi Shmidman


Abstract
The corpus of post-Rabbinic historical Hebrew is a foundational corpus of Jewish heritage, containing over a billion words of legal, hermeneutical, and philosophic texts (and more). However, because the linguistic norms of the corpus diverge so often from those of modern Hebrew, the corpus cannot be computationally analyzed with existing Hebrew parsers. To fill this lacuna, we present the first Universal Dependencies corpus of post-Rabbinic historical Hebrew. The corpus comprises over 11,800 words, and we are pleased to release it to the community.
Anthology ID:
2025.tlt-1.11
Volume:
Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025)
Month:
August
Year:
2025
Address:
Ljubljana, Slovenia
Editors:
Sarah Jablotschkin, Sandra Kübler, Heike Zinsmeister
Venues:
TLT | WS | SyntaxFest
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
91–96
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.tlt-1.11/
Cite (ACL):
Rachel Tal, Shlomit Fuchs, Orly Albeck, Elisheva Brauner, Yitzchak Lindenbaum, Ephraim Meiri, and Avi Shmidman. 2025. A New Hebrew Universal Dependency Treebank: The First Treebank of Post-Rabbinic Historical Hebrew. In Proceedings of the 23rd International Workshop on Treebanks and Linguistic Theories (TLT, SyntaxFest 2025), pages 91–96, Ljubljana, Slovenia. Association for Computational Linguistics.
Cite (Informal):
A New Hebrew Universal Dependency Treebank: The First Treebank of Post-Rabbinic Historical Hebrew (Tal et al., TLT-SyntaxFest 2025)
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.tlt-1.11.pdf