Abstract
Monolingual dictionaries are widespread and semantically rich resources. This paper presents a simple model that learns to compute word embeddings by processing dictionary definitions and trying to reconstruct them. It exploits the inherent recursivity of dictionaries by encouraging consistency between the representations it uses as inputs and the representations it produces as outputs. The resulting embeddings are shown to capture semantic similarity better than regular distributional methods and other dictionary-based methods. In addition, our method shows strong performance when trained exclusively on dictionary data and generalizes in one shot.
- Anthology ID:
- D18-1181
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1522–1532
- URL:
- https://aclanthology.org/D18-1181
- DOI:
- 10.18653/v1/D18-1181
- Cite (ACL):
- Tom Bosc and Pascal Vincent. 2018. Auto-Encoding Dictionary Definitions into Consistent Word Embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1522–1532, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Auto-Encoding Dictionary Definitions into Consistent Word Embeddings (Bosc & Vincent, EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-3/D18-1181.pdf
- Code
- tombosc/cpae + additional community code
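The core idea from the abstract — penalizing the distance between the embedding produced by auto-encoding a word's definition and the input embedding of that word — can be sketched in a few lines. This is an illustrative toy, not the authors' implementation (see tombosc/cpae for that): the paper encodes definitions with a trained recurrent autoencoder, whereas here the encoder is plain mean pooling, and the toy dictionary, dimensions, and function names are all invented for the example.

```python
import numpy as np

# Hypothetical toy dictionary: defined word -> definition tokens.
toy_dict = {
    "cat": ["small", "furry", "animal"],
    "dog": ["loyal", "furry", "animal"],
}

rng = np.random.default_rng(0)
vocab = sorted({t for d in toy_dict.values() for t in d} | set(toy_dict))
dim = 8
# Input embedding table, used to read definition tokens.
E_in = {w: rng.normal(size=dim) for w in vocab}

def encode_definition(tokens):
    """Toy encoder: mean of input embeddings.
    (The paper uses a recurrent autoencoder instead.)"""
    return np.mean([E_in[t] for t in tokens], axis=0)

def consistency_penalty(word, tokens):
    """Squared distance between the embedding produced from the
    definition and the input embedding of the defined word --
    the term that ties outputs back to inputs."""
    produced = encode_definition(tokens)
    return float(np.sum((produced - E_in[word]) ** 2))

# In training, this penalty would be added to the reconstruction
# loss and minimized jointly over all dictionary entries.
total_penalty = sum(consistency_penalty(w, d) for w, d in toy_dict.items())
```

Minimizing such a penalty over a whole dictionary is what exploits its recursive structure: the vector a word receives as a headword must agree with the vector it contributes when it appears inside other words' definitions.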