Integrating Semantic Knowledge into Lexical Embeddings Based on Information Content Measurement

Hsin-Yang Wang, Wei-Yun Ma


Abstract
Distributional word representations are widely used in NLP tasks. These representations rest on the assumption that words appearing in similar contexts tend to have similar meanings. To improve the quality of such context-based embeddings, much research has explored how to make full use of existing lexical resources. In this paper, we argue that when incorporating prior knowledge into context-based embeddings, words with different frequencies of occurrence should be treated differently. We therefore propose to use information content measurement to control the degree to which prior knowledge is applied to context-based embeddings: different words receive different learning rates when their embeddings are adjusted. Our results demonstrate that these embeddings achieve significant improvements on two tasks: Word Similarity and Analogical Reasoning.
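To make the abstract's idea concrete, here is a minimal Python sketch, a hypothetical illustration rather than the authors' released hywangntut/KBE implementation. It combines a retrofitting-style update, which pulls each word vector toward the mean of its lexicon neighbors, with a per-word learning rate derived from Resnik-style information content, IC(w) = -log p(w). The specific choice that rarer (higher-IC) words accept more prior knowledge, along with all function names and toy data below, are assumptions of this sketch.

import math

import numpy as np


def information_content(counts):
    # IC(w) = -log p(w), estimated from raw corpus occurrence counts.
    total = sum(counts.values())
    return {w: -math.log(c / total) for w, c in counts.items()}


def retrofit(vectors, lexicon, counts, iters=10):
    # Iteratively pull each vector toward the mean of its lexicon
    # neighbors; the pull strength is the word's normalized IC.
    ic = information_content(counts)
    max_ic = max(ic.values())
    new_vecs = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iters):
        for word, neighbors in lexicon.items():
            nbrs = [n for n in neighbors if n in new_vecs]
            if word not in new_vecs or not nbrs:
                continue
            # Per-word learning rate in (0, 1]: higher IC = stronger pull
            # toward the lexicon (an assumption of this sketch).
            alpha = ic.get(word, max_ic) / max_ic
            nbr_mean = np.mean([new_vecs[n] for n in nbrs], axis=0)
            # Convex combination of the original context-based vector
            # and the knowledge-based neighbor average.
            new_vecs[word] = (1 - alpha) * vectors[word] + alpha * nbr_mean
    return new_vecs


# Toy usage with hypothetical counts and a tiny synonym lexicon.
vectors = {w: np.random.randn(50) for w in ["car", "automobile", "fruit"]}
lexicon = {"car": ["automobile"], "automobile": ["car"]}
counts = {"car": 900, "automobile": 40, "fruit": 60}
retrofitted = retrofit(vectors, lexicon, counts)

Under this reading, frequent words, whose context-based vectors are already well estimated, stay close to their distributional embeddings, while rare words lean more heavily on the lexicon.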
Anthology ID:
E17-2082
Volume:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
Month:
April
Year:
2017
Address:
Valencia, Spain
Editors:
Mirella Lapata, Phil Blunsom, Alexander Koller
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
509–515
URL:
https://aclanthology.org/E17-2082
Cite (ACL):
Hsin-Yang Wang and Wei-Yun Ma. 2017. Integrating Semantic Knowledge into Lexical Embeddings Based on Information Content Measurement. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 509–515, Valencia, Spain. Association for Computational Linguistics.
Cite (Informal):
Integrating Semantic Knowledge into Lexical Embeddings Based on Information Content Measurement (Wang & Ma, EACL 2017)
PDF:
https://aclanthology.org/E17-2082.pdf
Code
 hywangntut/KBE