Learning Topic-Sensitive Word Representations

Marzieh Fadaee, Arianna Bisazza, Christof Monz


Abstract
Distributed word representations are widely used for modeling words in NLP tasks. Most existing models generate one representation per word and do not consider the different meanings of a word. We present two approaches to learning multiple topic-sensitive representations per word using a Hierarchical Dirichlet Process. We observe that by modeling topics and integrating topic distributions for each document, we obtain representations that are able to distinguish between different meanings of a given word. Our models yield statistically significant improvements on the lexical substitution task, indicating that commonly used single word representations, even when combined with contextual information, are insufficient for this task.
Anthology ID:
P17-2070
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
441–447
URL:
https://aclanthology.org/P17-2070
DOI:
10.18653/v1/P17-2070
Cite (ACL):
Marzieh Fadaee, Arianna Bisazza, and Christof Monz. 2017. Learning Topic-Sensitive Word Representations. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 441–447, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Learning Topic-Sensitive Word Representations (Fadaee et al., ACL 2017)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/P17-2070.pdf
Presentation:
P17-2070.Presentation.pdf
Code:
marziehf/TS_Embeddings