Interaction of Semantics and Morphology in Russian Word Vectors

Yulia Zinova, Ruben van de Vijver, Anastasia Yablokova


Abstract
In this paper we explore how morphological information can be extracted from fastText embeddings for Russian nouns. We investigate the negative effects of syncretism and propose ways of modifying the vectors that help to find better representations for morphological functions and thus for out-of-vocabulary words. In particular, we look at the effect of analysing shift vectors instead of the original vectors, discuss various ways of choosing the base forms from which shift vectors are created, and show that restricting the analysis to high-frequency data is beneficial when looking for structure in the embeddings with respect to morphosyntactic functions.
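The idea of a shift vector can be illustrated with a minimal sketch. It assumes that a shift vector is the difference between the embedding of an inflected form and the embedding of a chosen base form (here the nominative singular); the abstract itself notes that several choices of base form are possible. The vectors below are invented 3-dimensional toy values, not real fastText embeddings, and the helper names `shift_vector` and `mean_shift` are hypothetical.

```python
import numpy as np

# Toy 3-dimensional "embeddings" (invented values, NOT real fastText vectors).
vectors = {
    "стол":  np.array([1.0, 0.0, 2.0]),   # 'table', nominative singular
    "стола": np.array([1.5, 1.0, 2.0]),   # 'table', genitive singular
    "дом":   np.array([0.0, 3.0, 1.0]),   # 'house', nominative singular
    "дома":  np.array([0.5, 4.0, 1.0]),   # 'house', genitive singular
    "лес":   np.array([2.0, 2.0, 0.0]),   # 'forest', nominative singular
}


def shift_vector(vectors, form, base_form):
    """Assumed definition: embedding of the form minus embedding of its base form."""
    return vectors[form] - vectors[base_form]


def mean_shift(shifts):
    """Average several shift vectors into one vector for a morphological function."""
    return np.mean(np.stack(shifts), axis=0)


# Estimate a "genitive singular" shift from two known noun pairs.
gen_shift = mean_shift([
    shift_vector(vectors, "стола", "стол"),
    shift_vector(vectors, "дома", "дом"),
])

# Apply the shift to a base form to approximate an unseen inflected form.
predicted_gen_les = vectors["лес"] + gen_shift
```

With real embeddings, the toy dictionary would be replaced by vectors looked up in a trained model, and the predicted vector would be compared against the vectors of attested forms.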
Anthology ID: 2024.cogalex-1.14
Volume: Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024
Month: May
Year: 2024
Address: Torino, Italia
Editors: Michael Zock, Emmanuele Chersoni, Yu-Yin Hsu, Simon de Deyne
Venue: CogALex
Publisher: ELRA and ICCL
Pages: 120–128
URL: https://aclanthology.org/2024.cogalex-1.14
Cite (ACL): Yulia Zinova, Ruben van de Vijver, and Anastasia Yablokova. 2024. Interaction of Semantics and Morphology in Russian Word Vectors. In Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024, pages 120–128, Torino, Italia. ELRA and ICCL.
Cite (Informal): Interaction of Semantics and Morphology in Russian Word Vectors (Zinova et al., CogALex 2024)
PDF: https://preview.aclanthology.org/add_acl24_videos/2024.cogalex-1.14.pdf