Abstract
We present an evaluation of Czech low-dimensional distributed word representations, also known as word embeddings. We describe five different approaches to training the models and three different corpora used in training. We evaluate the resulting models on five different datasets, report the results, and analyze them further.
- Anthology ID:
- W19-6107
- Volume:
- Proceedings of the 22nd Nordic Conference on Computational Linguistics
- Month:
- September–October
- Year:
- 2019
- Address:
- Turku, Finland
- Venue:
- NoDaLiDa
- Publisher:
- Linköping University Electronic Press
- Pages:
- 65–75
- URL:
- https://aclanthology.org/W19-6107
- Cite (ACL):
- Karolína Hořeňovská. 2019. An evaluation of Czech word embeddings. In Proceedings of the 22nd Nordic Conference on Computational Linguistics, pages 65–75, Turku, Finland. Linköping University Electronic Press.
- Cite (Informal):
- An evaluation of Czech word embeddings (Hořeňovská, NoDaLiDa 2019)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/W19-6107.pdf