Abstract
We propose a simple and efficient framework to learn syntactic embeddings based on information derived from constituency parse trees. Using biased random walk methods, our embeddings encode not only syntactic information about words but also contextual information. We also propose a method to train the embeddings on multiple constituency parse trees to ensure the encoding of a global syntactic representation. Quantitative evaluation of the embeddings shows competitive performance on the POS tagging task when compared to other types of embeddings, and qualitative evaluation reveals interesting facts about the syntactic typology learned by these embeddings.
- Anthology ID:
- 2020.textgraphs-1.8
- Volume:
- Proceedings of the Graph-based Methods for Natural Language Processing (TextGraphs)
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona, Spain (Online)
- Editors:
- Dmitry Ustalov, Swapna Somasundaran, Alexander Panchenko, Fragkiskos D. Malliaros, Ioana Hulpuș, Peter Jansen, Abhik Jana
- Venue:
- TextGraphs
- Publisher:
- Association for Computational Linguistics
- Pages:
- 72–78
- URL:
- https://aclanthology.org/2020.textgraphs-1.8
- DOI:
- 10.18653/v1/2020.textgraphs-1.8
- Cite (ACL):
- Ragheb Al-Ghezi and Mikko Kurimo. 2020. Graph-based Syntactic Word Embeddings. In Proceedings of the Graph-based Methods for Natural Language Processing (TextGraphs), pages 72–78, Barcelona, Spain (Online). Association for Computational Linguistics.
- Cite (Informal):
- Graph-based Syntactic Word Embeddings (Al-Ghezi & Kurimo, TextGraphs 2020)
- PDF:
- https://preview.aclanthology.org/bionlp-24-ingestion/2020.textgraphs-1.8.pdf