Distributional Semantics for Neo-Latin
Jelke Bloem, Maria Chiara Parisi, Martin Reynaert, Yvette Oortwijn, Arianna Betti
Abstract
We address the problem of creating and evaluating quality Neo-Latin word embeddings for the purpose of philosophical research, adapting the Nonce2Vec tool to learn embeddings from Neo-Latin sentences. This distributional semantic modeling tool can learn from tiny data incrementally, using a larger background corpus for initialization. We conduct two evaluation tasks: definitional learning of Latin Wikipedia terms, and learning consistent embeddings from 18th century Neo-Latin sentences pertaining to the concept of mathematical method. Our results show that consistent Neo-Latin word embeddings can be learned from this type of data. While our evaluation results are promising, they do not reveal to what extent the learned models match domain expert knowledge of our Neo-Latin texts. Therefore, we propose an additional evaluation method, grounded in expert-annotated data, that would assess whether learned representations are conceptually sound in relation to the domain of study.
- Anthology ID: 2020.lt4hala-1.13
- Volume: Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages
- Month: May
- Year: 2020
- Address: Marseille, France
- Editors: Rachele Sprugnoli, Marco Passarotti
- Venue: LT4HALA
- Publisher: European Language Resources Association (ELRA)
- Pages: 84–93
- Language: English
- URL: https://aclanthology.org/2020.lt4hala-1.13
- Cite (ACL): Jelke Bloem, Maria Chiara Parisi, Martin Reynaert, Yvette Oortwijn, and Arianna Betti. 2020. Distributional Semantics for Neo-Latin. In Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages, pages 84–93, Marseille, France. European Language Resources Association (ELRA).
- Cite (Informal): Distributional Semantics for Neo-Latin (Bloem et al., LT4HALA 2020)
- PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.lt4hala-1.13.pdf