Word Embeddings as Tuples of Feature Probabilities
Siddharth Bhat, Alok Debnath, Souvik Banerjee, Manish Shrivastava
Abstract
In this paper, we provide an alternate perspective on word representations by reinterpreting the dimensions of the vector space of a word embedding as a collection of features. In this reinterpretation, every component of the word vector is normalized against all the word vectors in the vocabulary. This allows us to view each vector as an n-tuple (akin to a fuzzy set), where n is the dimensionality of the word representation and each element represents the probability of the word possessing a feature. This representation enables the use of fuzzy set-theoretic operations, such as union, intersection and difference. Unlike previous attempts, we show that this representation of words provides a notion of similarity which is inherently asymmetric and hence closer to human similarity judgements. We compare the performance of this representation with various benchmarks, and explore some of its unique properties, including function word detection, detection of polysemous words, and some insight into the interpretability provided by set-theoretic operations.
- Anthology ID:
- 2020.repl4nlp-1.4
- Volume:
- Proceedings of the 5th Workshop on Representation Learning for NLP
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Spandana Gella, Johannes Welbl, Marek Rei, Fabio Petroni, Patrick Lewis, Emma Strubell, Minjoon Seo, Hannaneh Hajishirzi
- Venue:
- RepL4NLP
- SIG:
- SIGREP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 24–33
- URL:
- https://aclanthology.org/2020.repl4nlp-1.4
- DOI:
- 10.18653/v1/2020.repl4nlp-1.4
- Cite (ACL):
- Siddharth Bhat, Alok Debnath, Souvik Banerjee, and Manish Shrivastava. 2020. Word Embeddings as Tuples of Feature Probabilities. In Proceedings of the 5th Workshop on Representation Learning for NLP, pages 24–33, Online. Association for Computational Linguistics.
- Cite (Informal):
- Word Embeddings as Tuples of Feature Probabilities (Bhat et al., RepL4NLP 2020)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/2020.repl4nlp-1.4.pdf
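
The reinterpretation described in the abstract (normalizing each embedding dimension across the vocabulary, then treating each word as a fuzzy set over features) can be illustrated with a minimal sketch. The abstract does not state the normalization scheme, the fuzzy operators, or the similarity measure, so everything below is an assumption made for illustration: per-dimension min-max scaling, the standard min/max fuzzy operators, and a subsethood ratio as one possible asymmetric measure. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def to_feature_probabilities(embeddings):
    """Rescale each dimension (column) to [0, 1] across the vocabulary (rows).

    ASSUMPTION: min-max scaling stands in for the paper's normalization,
    which the abstract does not specify.
    """
    lo = embeddings.min(axis=0, keepdims=True)
    hi = embeddings.max(axis=0, keepdims=True)
    # Constant dimensions get a span of 1 to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)
    return (embeddings - lo) / span


# ASSUMPTION: standard (Zadeh) fuzzy set operators, applied per feature.
def fuzzy_union(a, b):
    return np.maximum(a, b)


def fuzzy_intersection(a, b):
    return np.minimum(a, b)


def fuzzy_difference(a, b):
    # Features that a has and b lacks: a AND (NOT b).
    return np.minimum(a, 1.0 - b)


def subsethood(a, b):
    """Degree to which a is contained in b: |a AND b| / |a|.

    ASSUMPTION: an illustrative asymmetric measure, not the paper's.
    In general subsethood(a, b) != subsethood(b, a).
    """
    total = a.sum()
    if total == 0:
        return 0.0
    return float(fuzzy_intersection(a, b).sum() / total)


if __name__ == "__main__":
    # Toy 3-word, 3-dimensional "vocabulary"; values are made up.
    toy = np.array([[1.0, -2.0, 0.5],
                    [3.0, 0.0, 0.5],
                    [2.0, 2.0, 0.5]])
    probs = to_feature_probabilities(toy)
    print(probs)
    print(subsethood(probs[1], probs[2]), subsethood(probs[2], probs[1]))
```

The asymmetry comes from the denominator: a word whose features are all covered by another word is fully contained in it, while the reverse containment is only partial.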