Abstract
Vector space representations of words capture many aspects of word similarity, but such methods tend to produce vector spaces in which antonyms (as well as synonyms) are close to each other. For spectral clustering using such word embeddings, words are points in a vector space where synonyms are linked with positive weights, while antonyms are linked with negative weights. We present a new signed spectral normalized graph cut algorithm, signed clustering, that overlays existing thesauri upon distributionally derived vector representations of words, so that antonym relationships between word pairs are represented by negative weights. Our signed clustering algorithm produces clusters of words that simultaneously capture distributional and synonym relations. By using randomized spectral decomposition (Halko et al., 2011) and sparse matrices, our method is both fast and scalable. We validate our clusters using datasets containing human judgments of word pair similarities and show the benefit of using our word clusters for sentiment prediction.

- Anthology ID: P17-1087
- Volume: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month: July
- Year: 2017
- Address: Vancouver, Canada
- Editors: Regina Barzilay, Min-Yen Kan
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 939–949
- URL: https://aclanthology.org/P17-1087
- DOI: 10.18653/v1/P17-1087
- Cite (ACL): João Sedoc, Jean Gallier, Dean Foster, and Lyle Ungar. 2017. Semantic Word Clusters Using Signed Spectral Clustering. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 939–949, Vancouver, Canada. Association for Computational Linguistics.
- Cite (Informal): Semantic Word Clusters Using Signed Spectral Clustering (Sedoc et al., ACL 2017)
- PDF: https://preview.aclanthology.org/add_acl24_videos/P17-1087.pdf
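
As a rough illustration of the idea the abstract describes, the Python sketch below overlays thesaurus antonym pairs on embedding similarities and splits the words with a signed normalized Laplacian. Several details are assumptions rather than the paper's stated method: the overlay rule (flipping the sign of the cosine similarity for antonym pairs), the signed-degree definition (row sums of absolute weights, a common choice for signed graphs), the toy two-dimensional vectors, and the hypothetical function name `signed_two_way_split`. It also uses a dense eigendecomposition instead of the randomized spectral decomposition cited in the abstract, and it handles only two clusters by reading the sign of one eigenvector; a k-way version would typically run k-means on the first k eigenvectors.

```python
import numpy as np


def signed_two_way_split(vectors, antonym_pairs):
    """Split words into two groups using a signed normalized Laplacian.

    Hypothetical helper for illustration only; not the paper's code.
    """
    # Distributional weights: cosine similarity between all word vectors.
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    W = unit @ unit.T
    np.fill_diagonal(W, 0.0)

    # Assumed overlay rule: thesaurus antonym pairs get a negative weight.
    for i, j in antonym_pairs:
        W[i, j] = W[j, i] = -abs(W[i, j])

    # Signed degree sums absolute weights, so it stays positive.
    d = np.abs(W).sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)

    # Signed normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; keep the eigenvector
    # of the smallest one and split the words by the sign of its entries.
    _, eigvecs = np.linalg.eigh(L_sym)
    return eigvecs[:, 0] > 0


# Toy vectors in which antonyms sit close together, as the abstract warns.
words = ["good", "great", "bad", "awful"]
vectors = np.array([[1.0, 0.2], [0.9, 0.3], [1.0, 0.1], [0.8, 0.1]])

# Hypothetical thesaurus entries, given as index pairs into `words`.
antonyms = [(0, 2), (0, 3), (1, 2), (1, 3)]

labels = signed_two_way_split(vectors, antonyms)
for word, label in zip(words, labels):
    print(word, int(label))
```

In this toy setup every cross-group pair is marked as an antonym, so the signed graph is balanced and the sign pattern of the eigenvector separates "good" and "great" from "bad" and "awful". Which group receives label 1 is arbitrary, because an eigenvector is only defined up to sign.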