High-Accuracy Transition-Based Constituency Parsing

John Bauer, Christopher D. Manning


Abstract
Constituency parsers have improved markedly in recent years, with the F1 accuracy on the venerable Penn Treebank reaching 96.47, half of the error rate of the first transformer model in 2017. However, while dependency parsing frequently uses transition-based parsers, it is unclear whether transition-based parsing can still provide state-of-the-art results for constituency parsing. Despite promising work by Liu and Zhang in 2017 using an in-order transition-based parser, recent work uses other methods, mainly CKY charts built over LLM encoders. Starting from previous work, we implement self-training and a dynamic oracle to make a language-agnostic transition-based constituency parser. We test on seven languages; using Electra embeddings as the input layer on Penn Treebank, with a self-training dataset built from Wikipedia, our parser achieves a new SOTA F1 of 96.61.
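
For readers unfamiliar with the in-order transition system mentioned in the abstract, the following is a minimal sketch of how such a system can assemble a tree from a sequence of actions. It is an illustration only and not the authors' implementation: the action names (`SHIFT`, `PJ-X`, `REDUCE`), the `Node` and `Projection` classes, and the `run_in_order` function are hypothetical names chosen for this example, and part-of-speech tags, the neural scoring model, the dynamic oracle, and self-training are all omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node; a node with no children is a word."""
    label: str
    children: list = field(default_factory=list)

    def __str__(self):
        if not self.children:
            return self.label
        return "(" + self.label + " " + " ".join(str(c) for c in self.children) + ")"


@dataclass
class Projection:
    """Marker for a nonterminal that has been projected but not yet closed."""
    label: str


def run_in_order(words, actions):
    """Apply in-order actions to a sentence and return the resulting tree.

    SHIFT   moves the next word from the buffer onto the stack.
    PJ-X    projects nonterminal X over the item currently on top of the
            stack, which becomes the first child of X.
    REDUCE  closes the most recent open projection, attaching its first
            child (just below the marker) and everything above the marker.
    """
    stack = []
    buffer = list(words)
    for action in actions:
        if action == "SHIFT":
            stack.append(Node(buffer.pop(0)))
        elif action.startswith("PJ-"):
            stack.append(Projection(action[3:]))
        elif action == "REDUCE":
            later_children = []
            while not isinstance(stack[-1], Projection):
                later_children.append(stack.pop())
            label = stack.pop().label
            first_child = stack.pop()
            stack.append(Node(label, [first_child] + later_children[::-1]))
        else:
            raise ValueError(f"unknown action: {action}")
    if buffer or len(stack) != 1:
        raise ValueError("action sequence did not produce a single tree")
    return stack[0]


# Hypothetical example sentence and hand-written action sequence.
tree = run_in_order(
    ["The", "cat", "sleeps"],
    ["SHIFT", "PJ-NP", "SHIFT", "REDUCE",
     "PJ-S", "SHIFT", "PJ-VP", "REDUCE", "REDUCE"],
)
print(tree)
```

In this scheme each nonterminal is introduced after its first child and before the remaining children, which is what distinguishes in-order parsing from top-down and bottom-up transition systems. In a real parser the action at each step would be predicted by a model rather than supplied by hand.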
Anthology ID:
2025.iwpt-1.4
Volume:
Proceedings of the 18th International Conference on Parsing Technologies (IWPT, SyntaxFest 2025)
Month:
August
Year:
2025
Address:
Ljubljana, Slovenia
Editors:
Kenji Sagae, Stephan Oepen
Venues:
IWPT | SyntaxFest
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
26–39
URL:
https://preview.aclanthology.org/transition-to-people-yaml/2025.iwpt-1.4/
Cite (ACL):
John Bauer and Christopher D. Manning. 2025. High-Accuracy Transition-Based Constituency Parsing. In Proceedings of the 18th International Conference on Parsing Technologies (IWPT, SyntaxFest 2025), pages 26–39, Ljubljana, Slovenia. Association for Computational Linguistics.
Cite (Informal):
High-Accuracy Transition-Based Constituency Parsing (Bauer & Manning, IWPT-SyntaxFest 2025)
PDF:
https://preview.aclanthology.org/transition-to-people-yaml/2025.iwpt-1.4.pdf