Semi-Supervised Dependency Parsing with Arc-Factored Variational Autoencoding

Ge Wang, Kewei Tu


Abstract
Manual annotation for dependency parsing is both laborious and time-consuming, which makes it difficult to learn practical dependency parsers for the many languages that lack labelled training corpora. To compensate for the scarcity of labelled data, semi-supervised dependency parsing methods have been developed to exploit unlabelled data when training dependency parsers. In previous work, the autoencoder framework has been a prevalent approach to using unlabelled data. In this framework, training sentences are reconstructed by a decoder conditioned on dependency trees predicted by an encoder. The tree-structure requirement poses challenges for both the encoder and the decoder. Sophisticated techniques have been employed to tackle these challenges, at the expense of model complexity and approximations in encoding and decoding. In this paper, we propose a model based on the variational autoencoder framework. By relaxing the tree constraint in both the encoder and the decoder during training, we make the learning of our model fully arc-factored and thus circumvent the challenges brought by the tree constraint. We evaluate our model on datasets across several languages, and the results demonstrate the advantage of our model over previous approaches in both parsing accuracy and speed.
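To illustrate what "fully arc-factored" can mean once the tree constraint is relaxed, the following is a minimal sketch and not the authors' implementation. It assumes that each word chooses its head independently through a softmax over arc scores, and that the decoder reconstructs each word from its head alone, so the sum over heads can be computed exactly, word by word. The names `head_posteriors`, `reconstruction_log_likelihood`, `arc_scores`, and `word_given_head` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def head_posteriors(arc_scores):
    """Per-word head distributions with no tree constraint (assumed encoder).

    arc_scores[i][h] is the score of head candidate h for word i.
    Candidate 0 is an artificial ROOT; word i itself is candidate i + 1.
    Each row is normalised on its own, so the result may not form a tree.
    """
    posteriors = []
    for i, row in enumerate(arc_scores):
        # A word cannot be its own head, so mask the self-attachment.
        masked = [s if h != i + 1 else float("-inf") for h, s in enumerate(row)]
        posteriors.append(softmax(masked))
    return posteriors


def reconstruction_log_likelihood(posteriors, word_given_head):
    """Log-likelihood of the sentence under an assumed arc-factored decoder.

    word_given_head[i][h] is p(word i | head candidate h).
    Because arcs are independent, the sum over heads is taken per word,
    which costs O(n^2) overall and needs no tree enumeration.
    """
    total = 0.0
    for q_row, p_row in zip(posteriors, word_given_head):
        total += math.log(sum(q * p for q, p in zip(q_row, p_row)))
    return total


# Toy two-word sentence: three head candidates per word (ROOT, word 0, word 1).
arc_scores = [[0.0, 0.0, 1.0], [2.0, 0.5, 0.0]]
word_given_head = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
posteriors = head_posteriors(arc_scores)
log_likelihood = reconstruction_log_likelihood(posteriors, word_given_head)
```

In this sketch a valid tree would still have to be recovered at test time with a separate decoding algorithm; the relaxation concerns training only, as the abstract states.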
Anthology ID:
2020.coling-main.224
Volume:
Proceedings of the 28th International Conference on Computational Linguistics
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Venue:
COLING
Publisher:
International Committee on Computational Linguistics
Pages:
2485–2496
URL:
https://aclanthology.org/2020.coling-main.224
DOI:
10.18653/v1/2020.coling-main.224
Cite (ACL):
Ge Wang and Kewei Tu. 2020. Semi-Supervised Dependency Parsing with Arc-Factored Variational Autoencoding. In Proceedings of the 28th International Conference on Computational Linguistics, pages 2485–2496, Barcelona, Spain (Online). International Committee on Computational Linguistics.
Cite (Informal):
Semi-Supervised Dependency Parsing with Arc-Factored Variational Autoencoding (Wang & Tu, COLING 2020)
PDF:
https://preview.aclanthology.org/update-css-js/2020.coling-main.224.pdf
Data
Penn Treebank