A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora

[How to correct problems with metadata yourself]


Abstract
Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features. This paper introduces a la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based upon recent theoretical results for GloVe-like embeddings. Our method relies mainly on a linear transformation that is efficiently learnable using pretrained word vectors and linear regression. This transform is applicable on the fly in the future when a new text feature or rare word is encountered, even if only a single usage example is available. We introduce a new dataset showing how the a la carte method requires fewer examples of words in context to learn high-quality embeddings and we obtain state-of-the-art results on a nonce task and some unsupervised document classification tasks.
Anthology ID:
P18-1002
Volume:
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Iryna Gurevych, Yusuke Miyao
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
12–22
Language:
URL:
https://aclanthology.org/P18-1002
DOI:
10.18653/v1/P18-1002
Bibkey:
Cite (ACL):
Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, and Sanjeev Arora. 2018. A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 12–22, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors (Khodak et al., ACL 2018)
Copy Citation:
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/P18-1002.pdf
Presentation:
 P18-1002.Presentation.pdf
Video:
 https://preview.aclanthology.org/teach-a-man-to-fish/P18-1002.mp4
Code
 NLPrinceton/ALaCarte
Data
IMDb Movie ReviewsMPQA Opinion CorpusMRSSTSST-2SST-5SUBJ