Lexical Relation Mining in Neural Word Embeddings

Aishwarya Jadhav, Yifat Amir, Zachary Pardos


Abstract
Work with neural word embeddings and lexical relations has largely focused on confirmatory experiments that validate against human-curated examples of semantic and syntactic relations. In this paper, we explore the degree to which lexical relations, such as those found in popular validation sets, can be derived and extended from a variety of neural embeddings using classical clustering methods. We show that the Word2Vec space of word pairs (i.e., offset vectors) significantly outperforms other, more contemporary methods, even in the presence of a large number of noisy offsets. Moreover, we show that a simple nearest-neighbor approach in the offset space can discover new examples of known relations. Our results speak to the amenability of offset vectors from non-contextual neural embeddings to finding semantically coherent clusters. This simple approach has implications for the exploration of emergent regularities and their examples, such as emerging trends on social media and their related posts.
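To make the idea of nearest-neighbor search in the offset space concrete, here is a minimal sketch. It is not the authors' implementation: the three-dimensional vectors are made up for illustration (they come from no trained model), the helper names `offset`, `cosine`, and `nearest_pairs` are hypothetical, and the use of cosine similarity is an assumption, since the abstract does not name a distance measure.

```python
import math

# Toy 3-d embeddings, invented for illustration only (not from Word2Vec).
emb = {
    "man":    [1.0, 0.0, 0.2],
    "woman":  [1.0, 1.0, 0.2],
    "king":   [0.2, 0.0, 1.0],
    "queen":  [0.2, 1.0, 1.0],
    "paris":  [0.9, 0.1, 0.0],
    "france": [0.0, 0.1, 0.9],
}

def offset(pair):
    """Offset vector of a word pair (a, b): emb[b] - emb[a]."""
    a, b = pair
    return [y - x for x, y in zip(emb[a], emb[b])]

def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def nearest_pairs(query, candidates, k=1):
    """Rank candidate pairs by how closely their offset matches the query's."""
    q = offset(query)
    scored = [(cosine(q, offset(pair)), pair) for pair in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]

candidates = [("king", "queen"), ("paris", "france"), ("man", "king")]
top = nearest_pairs(("man", "woman"), candidates, k=1)
print(top)
```

With these toy vectors, the pair whose offset is closest to that of (man, woman) is (king, queen), which is the sense in which a new example of a known relation can be found by looking at neighbors of a known pair's offset.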
Anthology ID:
2020.coling-main.112
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
1299–1311
URL:
https://aclanthology.org/2020.coling-main.112
DOI:
10.18653/v1/2020.coling-main.112
Cite (ACL):
Aishwarya Jadhav, Yifat Amir, and Zachary Pardos. 2020. Lexical Relation Mining in Neural Word Embeddings. In Proceedings of the 28th International Conference on Computational Linguistics, pages 1299–1311, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Lexical Relation Mining in Neural Word Embeddings (Jadhav et al., COLING 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.coling-main.112.pdf