Abstract
The connection between dependency trees and spanning trees is exploited by the NLP community to train and to decode graph-based dependency parsers. However, the NLP literature has missed an important difference between the two structures: only one edge may emanate from the root in a dependency tree. We analyzed the output of state-of-the-art parsers on many languages from the Universal Dependency Treebank: although these parsers are often able to learn that trees which violate the constraint should be assigned lower probabilities, their ability to do so unsurprisingly de-grades as the size of the training set decreases. In fact, the worst constraint-violation rate we observe is 24%. Prior work has proposed an inefficient algorithm to enforce the constraint, which adds a factor of n to the decoding runtime. We adapt an algorithm due to Gabow and Tarjan (1984) to dependency parsing, which satisfies the constraint without compromising the original runtime.- Anthology ID:
- 2020.emnlp-main.390
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
- Venue:
- EMNLP
- SIG:
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 4809–4819
- Language:
- URL:
- https://aclanthology.org/2020.emnlp-main.390
- DOI:
- 10.18653/v1/2020.emnlp-main.390
- Cite (ACL):
- Ran Zmigrod, Tim Vieira, and Ryan Cotterell. 2020. Please Mind the Root: Decoding Arborescences for Dependency Parsing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 4809–4819, Online. Association for Computational Linguistics.
- Cite (Informal):
- Please Mind the Root: Decoding Arborescences for Dependency Parsing (Zmigrod et al., EMNLP 2020)
- PDF:
- https://preview.aclanthology.org/revert-3132-ingestion-checklist/2020.emnlp-main.390.pdf
- Code
- rycolab/spanningtrees