Bayesian Learning for Neural Dependency Parsing

Ehsan Shareghi, Yingzhen Li, Yi Zhu, Roi Reichart, Anna Korhonen


Abstract
While neural dependency parsers provide state-of-the-art accuracy for several languages, they still rely on large amounts of costly labeled training data. We demonstrate that in the small data regime, where uncertainty around parameter estimation and model prediction matters the most, Bayesian neural modeling is very effective. In order to overcome the computational and statistical costs of the approximate inference step in this framework, we utilize an efficient sampling procedure via stochastic gradient Langevin dynamics to generate samples from the approximated posterior. Moreover, we show that our Bayesian neural parser can be further improved when integrated into a multi-task parsing and POS tagging framework, designed to minimize task interference via an adversarial procedure. When trained and tested on 6 languages with less than 5k training instances, our parser consistently outperforms the strong bilstm baseline (Kiperwasser and Goldberg, 2016). Compared with the biaffine parser (Dozat et al., 2017) our model achieves an improvement of up to 3% for Vietnames and Irish, while our multi-task model achieves an improvement of up to 9% across five languages: Farsi, Russian, Turkish, Vietnamese, and Irish.
Anthology ID:
N19-1354
Volume:
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota
Editors:
Jill Burstein, Christy Doran, Thamar Solorio
Venue:
NAACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
3509–3519
Language:
URL:
https://aclanthology.org/N19-1354
DOI:
10.18653/v1/N19-1354
Bibkey:
Cite (ACL):
Ehsan Shareghi, Yingzhen Li, Yi Zhu, Roi Reichart, and Anna Korhonen. 2019. Bayesian Learning for Neural Dependency Parsing. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3509–3519, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal):
Bayesian Learning for Neural Dependency Parsing (Shareghi et al., NAACL 2019)
Copy Citation:
PDF:
https://preview.aclanthology.org/naacl24-info/N19-1354.pdf
Video:
 https://vimeo.com/361752492