Improved Dependency Parsing using Implicit Word Connections Learned from Unlabeled Data

Wenhui Wang, Baobao Chang, Mairgup Mansur


Abstract
Pre-trained word embeddings and language models have been shown to be useful in many tasks. However, neither directly captures word connections within a sentence, which are important for dependency parsing, whose goal is to establish dependency relations between words. In this paper, we propose to implicitly capture word connections from unlabeled data with a word ordering model that uses a self-attention mechanism. Experiments show that these implicit word connections improve our parsing model. Furthermore, by combining it with a pre-trained language model, our model achieves state-of-the-art performance on the English PTB dataset: 96.35% UAS and 95.25% LAS.
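
As a rough illustration only (not the authors' implementation), the sketch below shows how scaled dot-product self-attention over word vectors yields a pairwise weight matrix that can be read as soft word connections; in the paper, such connections are learned implicitly by training a word ordering model on unlabeled data. All names and dimensions here are hypothetical.

    # Minimal sketch of self-attention as soft word connections.
    # A[i, j] is the attention weight from word i to word j; the
    # paper learns such connections implicitly via a word ordering
    # objective on unlabeled text. Toy sizes, hypothetical names.
    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        """X: (n_words, d) word vectors; Wq/Wk/Wv: (d, d_k) projections."""
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])      # (n, n) pairwise scores
        A = np.exp(scores - scores.max(axis=-1, keepdims=True))
        A /= A.sum(axis=-1, keepdims=True)           # row-wise softmax
        return A @ V, A                              # contextual vectors, connections

    rng = np.random.default_rng(0)
    n, d, dk = 5, 16, 16                             # a 5-word sentence, toy dims
    X = rng.standard_normal((n, d))
    Wq, Wk, Wv = (rng.standard_normal((d, dk)) * 0.1 for _ in range(3))
    H, A = self_attention(X, Wq, Wk, Wv)
    print(A.round(2))                                # each row sums to 1

In a parser, such a connection matrix A (or the contextual vectors H) could be fed in as extra features alongside word embeddings; the print call simply shows the normalized pairwise weights for the toy sentence.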
Anthology ID:
D18-1311
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2857–2863
URL:
https://aclanthology.org/D18-1311
DOI:
10.18653/v1/D18-1311
Cite (ACL):
Wenhui Wang, Baobao Chang, and Mairgup Mansur. 2018. Improved Dependency Parsing using Implicit Word Connections Learned from Unlabeled Data. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2857–2863, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Improved Dependency Parsing using Implicit Word Connections Learned from Unlabeled Data (Wang et al., EMNLP 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D18-1311.pdf
Video:
https://vimeo.com/305667813
Data
Penn Treebank