Abstract
We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic category, and semantic role. We explore simple, efficient token embedding models based on standard neural network architectures. We learn token embeddings on a large amount of unannotated text and evaluate them as features for part-of-speech taggers and dependency parsers trained on much smaller amounts of annotated data. We find that predictors endowed with token embeddings consistently outperform baseline predictors across a range of context window and training set sizes.
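To make the setting concrete, the sketch below shows one simple way a context-window token embedder could look in PyTorch: the target word and its surrounding words are embedded and projected into a single context-sensitive vector. This is only an illustration of the general idea, not the specific architecture evaluated in the paper; the class name, window size, and dimensions are placeholders.

```python
import torch
import torch.nn as nn

class ContextWindowTokenEmbedder(nn.Module):
    """Illustrative sketch: embed a target word together with its context window.

    Not the exact model from Tu et al. (2017); vocab_size, window, and
    dimensions are placeholder assumptions.
    """
    def __init__(self, vocab_size, word_dim=100, token_dim=100, window=2):
        super().__init__()
        self.window = window
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # Concatenate the target word with `window` words on each side,
        # then project to a single context-sensitive token embedding.
        self.proj = nn.Sequential(
            nn.Linear((2 * window + 1) * word_dim, token_dim),
            nn.Tanh(),
        )

    def forward(self, word_ids):
        # word_ids: LongTensor of shape (batch, 2 * window + 1), with the
        # target word in the middle position and context words around it.
        embedded = self.word_emb(word_ids)      # (batch, 2w+1, word_dim)
        flat = embedded.flatten(start_dim=1)    # (batch, (2w+1)*word_dim)
        return self.proj(flat)                  # (batch, token_dim)

# Toy usage: embed the middle word of a 5-word window.
model = ContextWindowTokenEmbedder(vocab_size=10000)
window_ids = torch.randint(0, 10000, (3, 5))    # batch of 3 windows
token_vectors = model(window_ids)
print(token_vectors.shape)                      # torch.Size([3, 100])
```

In the setup described by the abstract, vectors of this kind would be learned from large amounts of unannotated text and then supplied as additional input features to a part-of-speech tagger or dependency parser trained on much smaller annotated datasets.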
- Anthology ID: W17-2632
- Volume: Proceedings of the 2nd Workshop on Representation Learning for NLP
- Month: August
- Year: 2017
- Address: Vancouver, Canada
- Editors: Phil Blunsom, Antoine Bordes, Kyunghyun Cho, Shay Cohen, Chris Dyer, Edward Grefenstette, Karl Moritz Hermann, Laura Rimell, Jason Weston, Scott Yih
- Venue: RepL4NLP
- SIG: SIGREP
- Publisher: Association for Computational Linguistics
- Pages: 265–275
- URL: https://aclanthology.org/W17-2632
- DOI: 10.18653/v1/W17-2632
- Cite (ACL): Lifu Tu, Kevin Gimpel, and Karen Livescu. 2017. Learning to Embed Words in Context for Syntactic Tasks. In Proceedings of the 2nd Workshop on Representation Learning for NLP, pages 265–275, Vancouver, Canada. Association for Computational Linguistics.
- Cite (Informal): Learning to Embed Words in Context for Syntactic Tasks (Tu et al., RepL4NLP 2017)
- PDF: https://preview.aclanthology.org/emnlp22-frontmatter/W17-2632.pdf
- Data: Penn Treebank