Abstract
Distributional word representations are widely used in NLP tasks. These representations rest on the assumption that words occurring in similar contexts tend to have similar meanings. To improve the quality of such context-based embeddings, much research has explored how to make full use of existing lexical resources. In this paper, we argue that when incorporating prior knowledge into context-based embeddings, words with different occurrence counts should be treated differently. We therefore propose to use information content measurement to control the degree to which prior knowledge is applied to context-based embeddings: different words receive different learning rates when their embeddings are adjusted. Our results show that the resulting embeddings achieve significant improvements on two tasks: Word Similarity and Analogical Reasoning.
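To make the abstract's core idea concrete, below is a minimal sketch of an information-content-weighted update. It is not the authors' implementation (see the linked hywangntut/KBE repository for that); it assumes a retrofitting-style update in the spirit of Faruqui et al. (2015), defines information content as IC(w) = -log p(w) from unigram counts, and scales each word's learning rate by its normalized IC. The direction of the scaling (rarer words moving more toward their lexicon neighbours) is an assumption of this sketch, as are all function and parameter names.

```python
import numpy as np
from math import log

def information_content(word, unigram_counts, total_count):
    """IC(w) = -log p(w): rarer words carry more information.
    (Assumed definition; the paper's exact IC measure may differ.)"""
    p = unigram_counts.get(word, 1) / total_count
    return -log(p)

def retrofit_with_ic(embeddings, lexicon, unigram_counts, total_count,
                     iterations=10, base_rate=1.0):
    """Hypothetical retrofitting-style pass: nudge each word vector toward
    the mean of its lexicon neighbours, with a per-word step size scaled
    by normalized information content."""
    new_emb = {w: v.copy() for w, v in embeddings.items()}
    max_ic = max(information_content(w, unigram_counts, total_count)
                 for w in embeddings)
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            neighbours = [n for n in neighbours if n in new_emb]
            if word not in new_emb or not neighbours:
                continue
            # Per-word learning rate derived from information content:
            # frequent (low-IC) words keep their context-based vector,
            # rare (high-IC) words lean more on the lexical resource.
            rate = base_rate * information_content(
                word, unigram_counts, total_count) / max_ic
            neighbour_mean = np.mean([new_emb[n] for n in neighbours], axis=0)
            # Convex combination of the original context-based vector
            # and the lexicon neighbourhood, weighted by the IC rate.
            new_emb[word] = (1 - rate) * embeddings[word] + rate * neighbour_mean
    return new_emb
```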
- Anthology ID:
- E17-2082
- Volume:
- Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Mirella Lapata, Phil Blunsom, Alexander Koller
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 509–515
- URL:
- https://aclanthology.org/E17-2082
- Cite (ACL):
- Hsin-Yang Wang and Wei-Yun Ma. 2017. Integrating Semantic Knowledge into Lexical Embeddings Based on Information Content Measurement. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 509–515, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Integrating Semantic Knowledge into Lexical Embeddings Based on Information Content Measurement (Wang & Ma, EACL 2017)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/E17-2082.pdf
- Code
- hywangntut/KBE