OdyCy – A general-purpose NLP pipeline for Ancient Greek
Jan Kostkan, Márton Kardos, Jacob Palle Bliddal Mortensen, Kristoffer Laigaard Nielbo
Abstract
This paper presents a general-purpose NLP pipeline that achieves state-of-the-art performance on the Ancient Greek Perseus UD Treebank for several tasks (POS Tagging, Morphological Analysis and Dependency Parsing), and close to state-of-the-art performance on the Proiel UD Treebank. Our aim is to provide a reproducible, open source language processing pipeline for Ancient Greek, capable of handling input texts of varying quality. We measure the performance of our model against other comparable tools and then evaluate lemmatization errors.
- Anthology ID: 2023.latechclfl-1.14
- Volume: Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
- Month: May
- Year: 2023
- Address: Dubrovnik, Croatia
- Editors: Stefania Degaetano-Ortlieb, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
- Venue: LaTeCHCLfL
- Publisher: Association for Computational Linguistics
- Pages: 128–134
- URL: https://aclanthology.org/2023.latechclfl-1.14
- DOI: 10.18653/v1/2023.latechclfl-1.14
- Cite (ACL): Jan Kostkan, Márton Kardos, Jacob Palle Bliddal Mortensen, and Kristoffer Laigaard Nielbo. 2023. OdyCy – A general-purpose NLP pipeline for Ancient Greek. In Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 128–134, Dubrovnik, Croatia. Association for Computational Linguistics.
- Cite (Informal): OdyCy – A general-purpose NLP pipeline for Ancient Greek (Kostkan et al., LaTeCHCLfL 2023)
- PDF: https://preview.aclanthology.org/proper-vol2-ingestion/2023.latechclfl-1.14.pdf