Caroline T. Schroeder
2025
A UD Treebank for Bohairic Coptic
Amir Zeldes | Nina Speransky | Nicholas E. Wagner | Caroline T. Schroeder
Proceedings of the Eighth Workshop on Universal Dependencies (UDW, SyntaxFest 2025)
Despite recent advances in digital resources for other Coptic dialects, especially Sahidic, Bohairic Coptic, the main Coptic dialect for pre-Mamluk, late Byzantine Egypt, and the contemporary language of the Coptic Church, remains critically under-resourced. This paper presents and evaluates the first syntactically annotated corpus of Bohairic Coptic, sampling data from a range of works, including Biblical text, saints’ lives and Christian ascetic writing. We also explore some of the main differences we observe compared to the existing UD treebank of Sahidic Coptic, the classical dialect of the language, and conduct joint and cross-dialect parsing experiments, revealing Bohairic to be a variety related to, but distinct from, the more often studied Sahidic.
2018
A Linked Coptic Dictionary Online
Frank Feder | Maxim Kupreyev | Emma Manning | Caroline T. Schroeder | Amir Zeldes
Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
We describe a new project publishing a freely available online dictionary for Coptic. The dictionary encompasses comprehensive cross-referencing mechanisms, including linking entries to an online scanned edition of Crum’s Coptic Dictionary, internal cross-references and etymological information, translated searchable definitions in English, French and German, and linked corpus data which provides frequencies and corpus look-up for headwords and multiword expressions. Headwords are available for linking in external projects using a REST API. We describe the challenges in encoding our dictionary using TEI XML and implementing linking mechanisms to construct a Web interface querying frequency information, which draw on NLP tools to recognize inflected forms in context. We evaluate our dictionary’s coverage using digital corpora of Coptic available online.
2016
An NLP Pipeline for Coptic
Amir Zeldes | Caroline T. Schroeder
Proceedings of the 10th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities