Andrew Wilson

Also published as: Andrew T. Wilson


Probabilistic FastText for Multi-Sense Word Embeddings
Ben Athiwaratkun | Andrew Wilson | Anima Anandkumar
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. In particular, we represent each word with a Gaussian mixture density, where the mean of a mixture component is given by the sum of the word's n-gram vectors. This representation allows the model to share “strength” across sub-word structures (e.g. Latin roots), producing accurate representations of rare, misspelt, or even unseen words. Moreover, each component of the mixture can capture a different word sense. Probabilistic FastText outperforms both FastText, which has no probabilistic model, and dictionary-level probabilistic embeddings, which do not incorporate sub-word structures, on several word-similarity benchmarks, including English RareWord and foreign language datasets. We also achieve state-of-the-art performance on benchmarks that measure the ability to discern different meanings. Thus, our model is the first to achieve the best of both worlds: multi-sense representations with enriched semantics on rare words.
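
The sub-word idea in this abstract, where a component mean is built by summing n-gram vectors, can be illustrated with a minimal sketch. It is not the authors' implementation. The n-gram length range (3 to 6), the `<` and `>` boundary markers, the hashed `NgramTable`, and the random initialisation are all assumptions introduced here for illustration; a trained model would learn the vectors.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=6):
    """Return the character n-grams of a word wrapped in boundary markers.

    The markers and the 3..6 length range are illustrative assumptions.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


class NgramTable:
    """Hypothetical hashed table of n-gram vectors (randomly initialised)."""

    def __init__(self, dim=4, buckets=1000, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        self.vectors = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(buckets)
        ]

    def vector(self, gram):
        # crc32 is stable across runs, unlike Python's built-in hash().
        return self.vectors[zlib.crc32(gram.encode("utf-8")) % self.buckets]


def component_mean(word, table):
    """Mean of one mixture component: the sum of the word's n-gram vectors."""
    mean = [0.0] * table.dim
    for gram in char_ngrams(word):
        vec = table.vector(gram)
        for d in range(table.dim):
            mean[d] += vec[d]
    return mean
```

For example, `char_ngrams("cat", 3, 3)` yields `["<ca", "cat", "at>"]`. Because the mean depends only on n-grams, `component_mean` also returns a vector for a word never seen in training, which is the property the abstract relies on for rare and misspelt words.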


Multimodal Word Distributions
Ben Athiwaratkun | Andrew Wilson
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Word embeddings provide point representations of words containing useful semantic information. We introduce multimodal word distributions formed from Gaussian mixtures, which capture multiple word meanings, entailment, and rich uncertainty information. To learn these distributions, we propose an energy-based max-margin objective. We show that the resulting approach captures uniquely expressive semantic information and outperforms alternatives, such as word2vec skip-grams and Gaussian embeddings, on benchmark tasks such as word similarity and entailment.
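
The energy-based max-margin objective mentioned in this abstract can be sketched as a hinge loss that pushes the energy of an observed word-context pair above that of a negative pair by a margin. The abstract does not define the energy, so the sketch assumes one plausible choice, the log expected likelihood kernel between two mixtures. The spherical covariances, the `(weight, mean, variance)` representation, and all function names are further assumptions made for illustration.

```python
import math


def log_gauss_overlap(mu_a, var_a, mu_b, var_b):
    """Log of the integral of the product of two spherical Gaussians.

    This equals log N(mu_a; mu_b, (var_a + var_b) * I).
    """
    dim = len(mu_a)
    var = var_a + var_b
    sq_dist = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    return -0.5 * dim * math.log(2.0 * math.pi * var) - 0.5 * sq_dist / var


def energy(mix_f, mix_g):
    """Log expected likelihood kernel between two Gaussian mixtures.

    Each mixture is a list of (weight, mean, variance) triples with
    strictly positive weights.
    """
    terms = [
        math.log(p) + math.log(q) + log_gauss_overlap(mu_f, var_f, mu_g, var_g)
        for p, mu_f, var_f in mix_f
        for q, mu_g, var_g in mix_g
    ]
    top = max(terms)  # log-sum-exp trick for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def max_margin_loss(word, context, negative, margin=1.0):
    """Hinge loss: zero once the true pair beats the negative pair by `margin`."""
    return max(0.0, margin - energy(word, context) + energy(word, negative))
```

With this form, a two-component word distribution scores highly against a context that lies near either of its components, which is how separate components can account for separate meanings.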


Term Weighting Schemes for Latent Dirichlet Allocation
Andrew T. Wilson | Peter A. Chew
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics


Measuring MWE Compositionality Using Semantic Annotation
Scott S.L. Piao | Paul Rayson | Olga Mudraya | Andrew Wilson | Roger Garside
Proceedings of the Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties


Extracting Multiword Expressions with A Semantic Tagger
Scott S. L. Piao | Paul Rayson | Dawn Archer | Andrew Wilson | Tony McEnery
Proceedings of the ACL 2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment