Evaluation of Greek Word Embeddings
Stamatis Outsios, Christos Karatsalos, Konstantinos Skianis, Michalis Vazirgiannis
Abstract
Since word embeddings have been the most popular input for many NLP tasks, evaluating their quality is critical. Most research efforts focus on English word embeddings. This paper addresses the problem of training and evaluating such models for the Greek language. We present a new word analogy test set that builds on the original English Word2vec analogy test set and also covers some linguistic aspects specific to Greek. Moreover, we create a Greek version of the WordSim353 test collection for a basic evaluation of word similarities. The produced resources are available for download. We test seven word vector models, and our evaluation shows that we are able to create meaningful representations. Finally, we find that the morphological complexity of the Greek language and polysemy can influence the quality of the resulting word embeddings.
- Anthology ID:
- 2020.lrec-1.310
- Volume:
- Proceedings of the Twelfth Language Resources and Evaluation Conference
- Month:
- May
- Year:
- 2020
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 2543–2551
- Language:
- English
- URL:
- https://aclanthology.org/2020.lrec-1.310
- Cite (ACL):
- Stamatis Outsios, Christos Karatsalos, Konstantinos Skianis, and Michalis Vazirgiannis. 2020. Evaluation of Greek Word Embeddings. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2543–2551, Marseille, France. European Language Resources Association.
- Cite (Informal):
- Evaluation of Greek Word Embeddings (Outsios et al., LREC 2020)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/2020.lrec-1.310.pdf