Approximation-Aware Dependency Parsing by Belief Propagation

Matthew R. Gormley, Mark Dredze, Jason Eisner


Abstract
We show how to train the fast dependency parser of Smith and Eisner (2008) for improved accuracy. This parser can consider higher-order interactions among edges while retaining O(n^3) runtime. It outputs the parse with maximum expected recall, but for speed, this expectation is taken under a posterior distribution that is constructed only approximately, using loopy belief propagation through structured factors. We show how to adjust the model parameters to compensate for the errors introduced by this approximation, by following the gradient of the actual loss on training data. We find this gradient by back-propagation. That is, we treat the entire parser (approximations and all) as a differentiable circuit, as others have done for loopy CRFs (Domke, 2010; Stoyanov et al., 2011; Domke, 2011; Stoyanov and Eisner, 2012). The resulting parser obtains higher accuracy with fewer iterations of belief propagation than one trained by conditional log-likelihood.
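As a rough illustration of the "differentiable circuit" view described in the abstract, the training setup can be sketched as follows. The notation is an assumption introduced here and does not appear on this page: \theta for the model parameters, m^{(t)} for the belief propagation messages after t iterations, f_\theta for one message update, g_\theta for the map from messages to approximate beliefs b, and \ell for a loss that is differentiable or has been replaced by a differentiable surrogate.

```latex
% Sketch only: all symbols are illustrative assumptions, not the paper's notation.
\begin{align}
  m^{(t)} &= f_\theta\big(m^{(t-1)}\big), \qquad t = 1, \dots, T
    && \text{(unrolled loopy BP updates)} \\
  b &= g_\theta\big(m^{(T)}\big)
    && \text{(approximate beliefs)} \\
  J(\theta) &= \sum_{(x,\,y)} \ell\big(b(x;\theta),\, y\big)
    && \text{(loss of the approximate parser on training data)} \\
  \theta &\leftarrow \theta - \eta \, \nabla_\theta J(\theta)
    && \text{(gradient step with step size } \eta \text{)}
\end{align}
```

Under these assumptions, \nabla_\theta J is obtained by applying the chain rule backward through g_\theta and each of the T copies of f_\theta, so the parameters are tuned for the approximate inference procedure that is actually run at test time and not for exact inference.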
Anthology ID:
Q15-1035
Volume:
Transactions of the Association for Computational Linguistics, Volume 3
Year:
2015
Address:
Cambridge, MA
Editors:
Michael Collins, Lillian Lee
Venue:
TACL
Publisher:
MIT Press
Pages:
489–501
URL:
https://aclanthology.org/Q15-1035
DOI:
10.1162/tacl_a_00153
Cite (ACL):
Matthew R. Gormley, Mark Dredze, and Jason Eisner. 2015. Approximation-Aware Dependency Parsing by Belief Propagation. Transactions of the Association for Computational Linguistics, 3:489–501.
Cite (Informal):
Approximation-Aware Dependency Parsing by Belief Propagation (Gormley et al., TACL 2015)
PDF:
https://preview.aclanthology.org/nschneid-patch-3/Q15-1035.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-3/Q15-1035.mp4
Data:
Penn Treebank