@inproceedings{brazinskas-etal-2018-embedding,
title = "Embedding Words as Distributions with a {B}ayesian Skip-gram Model",
author = "Bra{\v{z}}inskas, Arthur and
Havrylov, Serhii and
Titov, Ivan",
editor = "Bender, Emily M. and
Derczynski, Leon and
Isabelle, Pierre",
booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
month = aug,
year = "2018",
address = "Santa Fe, New Mexico, USA",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/C18-1151/",
pages = "1775--1789",
abstract = "We introduce a method for embedding words as probability densities in a low-dimensional space. Rather than assuming that a word embedding is fixed across the entire text collection, as in standard word embedding methods, in our Bayesian model we generate it from a word-specific prior density for each occurrence of a given word. Intuitively, for each word, the prior density encodes the distribution of its potential {\textquoteleft}meanings'. These prior densities are conceptually similar to Gaussian embeddings of \newcite{vilnis2014word}. Interestingly, unlike the Gaussian embeddings, we can also obtain context-specific densities: they encode uncertainty about the sense of a word given its context and correspond to the approximate posterior distributions within our model. The context-dependent densities have many potential applications: for example, we show that they can be directly used in the lexical substitution task. We describe an effective estimation method based on the variational autoencoding framework. We demonstrate the effectiveness of our embedding technique on a range of standard benchmarks."
}
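
As a rough illustration of the idea summarized in the abstract (not the authors' implementation), the sketch below represents each word as a diagonal-Gaussian density and computes the analytic KL divergence between a context-dependent approximate posterior and a word's prior, the kind of term that appears in a variational (ELBO-style) objective. All names and numbers here (`DIM`, the toy vocabulary, the hand-built posterior) are illustrative assumptions.

```python
# Illustrative sketch only: per-word diagonal-Gaussian "prior" densities
# N(mu_w, diag(exp(logvar_w))) and a toy context-dependent approximate posterior,
# scored with the closed-form KL between diagonal Gaussians.
import numpy as np

rng = np.random.default_rng(0)
DIM = 50
vocab = ["bank", "river", "money"]  # toy vocabulary (assumption)

# Hypothetical learned parameters: per-word prior mean and log-variance.
prior_mu = {w: rng.normal(scale=0.1, size=DIM) for w in vocab}
prior_logvar = {w: np.zeros(DIM) for w in vocab}

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, diag(exp(logvar_q))) || N(mu_p, diag(exp(logvar_p))) )."""
    var_q, var_p = np.exp(logvar_q), np.exp(logvar_p)
    return 0.5 * np.sum(logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

# Toy stand-in for the inference network: a "posterior" for an occurrence of
# "bank" in a river-related context, built by hand for illustration.
post_mu = 0.5 * (prior_mu["bank"] + prior_mu["river"])
post_logvar = prior_logvar["bank"] - 1.0  # sharper than the prior

print("KL(posterior || prior of 'bank'):",
      kl_diag_gaussians(post_mu, post_logvar, prior_mu["bank"], prior_logvar["bank"]))
```

In the model the abstract describes, the posterior parameters would instead come from a learned inference network conditioned on the occurrence's context, and the KL term would be combined with a reconstruction term over context words within the variational autoencoding objective.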