Abstract
Distributed word representations are an efficient way to capture semantic and syntactic relations between words. In this work, we introduce an extension of the continuous bag-of-words model that learns word representations efficiently by using implicit structure information. Instead of relying on a syntactic parser, which can be noisy and slow to build, we compute weights representing probabilities of syntactic relations with an efficient heuristic based on the Huffman softmax tree. "Implicit graphs" constructed from these weights show that they contain useful implicit structure information. Extensive experiments on several word similarity and word analogy tasks show gains over the basic continuous bag-of-words model.
- Anthology ID:
- C16-1227
- Volume:
- Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Yuji Matsumoto, Rashmi Prasad
- Venue:
- COLING
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 2408–2417
- URL:
- https://preview.aclanthology.org/remove-affiliations/C16-1227/
- Cite (ACL):
- Jie Shen and Cong Liu. 2016. Improved Word Embeddings with Implicit Structure Information. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 2408–2417, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Improved Word Embeddings with Implicit Structure Information (Shen & Liu, COLING 2016)
- PDF:
- https://preview.aclanthology.org/remove-affiliations/C16-1227.pdf
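The abstract's heuristic is built on the Huffman softmax tree, the binary tree word2vec's hierarchical softmax assigns over the vocabulary by word frequency. As background only (the paper's own weighting scheme is not reproduced here), a minimal sketch of building such a tree's codes from word counts; `build_huffman_codes` is a hypothetical helper, not from the paper:

```python
import heapq
from collections import Counter

def build_huffman_codes(freqs):
    """Build Huffman codes from a word -> frequency map.
    Frequent words end up with shorter codes, i.e. shallower tree paths."""
    # Each heap entry: (subtree frequency, tie-breaker, {word: code-so-far}).
    heap = [(f, i, {w: ""}) for i, (w, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees under a new internal node.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {w: "0" + c for w, c in left.items()}
        merged.update({w: "1" + c for w, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

corpus = "the cat sat on the mat the cat".split()
codes = build_huffman_codes(Counter(corpus))
# The most frequent word gets a code no longer than any other word's.
assert len(codes["the"]) <= min(len(c) for c in codes.values())
```

In hierarchical softmax each bit of a word's code corresponds to one binary decision at an internal node, so predicting a word costs one logistic step per code bit rather than a softmax over the whole vocabulary.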