Abstract
Word embedding-based similarity measures are currently among the top-performing methods on unsupervised semantic textual similarity (STS) tasks. Recent work has increasingly adopted a statistical view on these embeddings, with some of the top approaches being essentially various correlations (which include the famous cosine similarity). Another excellent candidate for a similarity measure is mutual information (MI), which can capture arbitrary dependencies between the variables and has a simple and intuitive expression. Unfortunately, its use in the context of dense word embeddings has so far been avoided due to difficulties with estimating MI for continuous data. In this work we go through a vast literature on estimating MI in such cases and single out the most promising methods, yielding a simple and elegant similarity measure for word embeddings. We show that mutual information is a viable alternative to correlations, gives an excellent signal that correlates well with human judgements of similarity and rivals existing state-of-the-art unsupervised methods.- Anthology ID:
- 2020.acl-main.741
- Volume:
- Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
- Month:
- July
- Year:
- 2020
- Address:
- Online
- Editors:
- Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
- Venue:
- ACL
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 8361–8371
- Language:
- URL:
- https://aclanthology.org/2020.acl-main.741
- DOI:
- 10.18653/v1/2020.acl-main.741
- Cite (ACL):
- Vitalii Zhelezniak, Aleksandar Savkov, and Nils Hammerla. 2020. Estimating Mutual Information Between Dense Word Embeddings. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 8361–8371, Online. Association for Computational Linguistics.
- Cite (Informal):
- Estimating Mutual Information Between Dense Word Embeddings (Zhelezniak et al., ACL 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2020.acl-main.741.pdf