Abstract
Word vectors of varying dimensionalities, produced by different algorithms, have been used extensively in NLP. The corpora these algorithms are trained on may contain either natural language text (e.g. Wikipedia or newswire articles) or, because natural data are sparse, artificially generated pseudo-corpora. We exploit lexical-chain-based templates over a knowledge graph to generate pseudo-corpora with controlled linguistic value. These corpora are then used for learning word embeddings. Experiments were conducted on the following test sets: WordSim353 Similarity, WordSim353 Relatedness, and SimLex-999. The results show that, on the one hand, incorporating many-relation lexical chains improves the results, while on the other hand, chains of unrestricted length remain difficult to handle because of their sheer number.
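As a rough illustration of the idea (not the paper's actual templates), the sketch below generates pseudo-sentences by walking a toy knowledge graph, interleaving words with relation labels to form many-relation lexical chains. Every node, relation label, and function name here is a hypothetical stand-in; the length bound mirrors the tractability issue the abstract raises for unrestricted-length chains.

```python
import random

# Hypothetical toy knowledge graph (illustrative only, not from the paper).
# Nodes are lexical concepts; edges carry a relation label, so a walk
# yields a "lexical chain" mixing words and relations.
KNOWLEDGE_GRAPH = {
    "dog":    [("hypernym", "animal"), ("similar_to", "wolf")],
    "wolf":   [("hypernym", "animal"), ("similar_to", "dog")],
    "animal": [("hyponym", "dog"), ("hyponym", "wolf")],
}

def generate_chain(start, max_len, rng):
    """Random walk from `start`, emitting word/relation tokens until
    `max_len` edges are traversed. Bounding the length keeps the number
    of possible chains manageable."""
    tokens = [start]
    node = start
    for _ in range(max_len):
        edges = KNOWLEDGE_GRAPH.get(node, [])
        if not edges:
            break
        relation, node = rng.choice(edges)
        tokens.extend([relation, node])  # keep relation labels in the chain
    return tokens

def build_pseudo_corpus(n_chains, max_len, seed=0):
    """Sample `n_chains` pseudo-sentences from random start nodes."""
    rng = random.Random(seed)
    starts = list(KNOWLEDGE_GRAPH)
    return [generate_chain(rng.choice(starts), max_len, rng)
            for _ in range(n_chains)]

if __name__ == "__main__":
    for sentence in build_pseudo_corpus(n_chains=5, max_len=3):
        print(" ".join(sentence))
    # Each printed line is one pseudo-sentence; a full pseudo-corpus of such
    # chains would then be fed to an off-the-shelf embedding trainer
    # (e.g. word2vec) to learn the word vectors.
```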
- Anthology ID:
- R17-1087
- Volume:
- Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 679–685
- URL:
- https://doi.org/10.26615/978-954-452-049-6_087
- DOI:
- 10.26615/978-954-452-049-6_087
- Cite (ACL):
- Kiril Simov, Svetla Boytcheva, and Petya Osenova. 2017. Towards Lexical Chains for Knowledge-Graph-based Word Embeddings. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 679–685, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Towards Lexical Chains for Knowledge-Graph-based Word Embeddings (Simov et al., RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-049-6_087