The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers

Agnieszka Falenska, Jonas Kuhn


Abstract
Classical non-neural dependency parsers put considerable effort into the design of feature functions. In particular, they benefit from structural features, such as features drawn from neighboring tokens in the dependency tree. In contrast, their BiLSTM-based successors achieve state-of-the-art performance without explicit information about the structural context. In this paper we aim to answer the question: How much structural context are the BiLSTM representations able to capture implicitly? We show that features drawn from partial subtrees become redundant when BiLSTMs are used. We provide a detailed insight into the information flow in transition- and graph-based neural architectures to demonstrate where the implicit information comes from when the parsers make their decisions. Finally, with model ablations we demonstrate that the structural context is not only present in the models but also significantly influences their performance.
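To make the architectural point concrete, the sketch below shows one common way a BiLSTM-based graph-based parser scores head-dependent arcs from token representations alone, without explicit structural features such as partial subtrees. This is a minimal illustrative sketch in PyTorch, not the authors' implementation; all dimensions, class and parameter names (e.g. BiLSTMArcScorer, lstm_dim, mlp_dim) are hypothetical.

```python
# Minimal sketch: arc scoring from BiLSTM token vectors only.
# Assumes PyTorch; all hyperparameters are placeholder values.
import torch
import torch.nn as nn

class BiLSTMArcScorer(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, lstm_dim=125, mlp_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # BiLSTM over the sentence: each token vector implicitly encodes
        # its full left and right sentential context.
        self.bilstm = nn.LSTM(emb_dim, lstm_dim, num_layers=2,
                              bidirectional=True, batch_first=True)
        # Arc scorer: an MLP over the concatenated head and dependent
        # vectors only -- no features drawn from the partial tree.
        self.mlp = nn.Sequential(
            nn.Linear(4 * lstm_dim, mlp_dim), nn.Tanh(), nn.Linear(mlp_dim, 1))

    def forward(self, word_ids):
        # word_ids: (batch, sent_len) token indices
        vecs, _ = self.bilstm(self.embed(word_ids))        # (B, n, 2*lstm_dim)
        n = vecs.size(1)
        heads = vecs.unsqueeze(2).expand(-1, n, n, -1)     # candidate heads
        deps = vecs.unsqueeze(1).expand(-1, n, n, -1)      # candidate dependents
        # scores[b, i, j] = score of token i heading token j
        return self.mlp(torch.cat([heads, deps], dim=-1)).squeeze(-1)

scores = BiLSTMArcScorer()(torch.randint(0, 10000, (1, 6)))
print(scores.shape)  # torch.Size([1, 6, 6])
```

The ablation question studied in the paper can be read off this sketch: the only place structural information could enter is through the BiLSTM states themselves, since the scorer never consults the partially built tree.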
Anthology ID:
P19-1012
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
117–128
URL:
https://aclanthology.org/P19-1012
DOI:
10.18653/v1/P19-1012
Cite (ACL):
Agnieszka Falenska and Jonas Kuhn. 2019. The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 117–128, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers (Falenska & Kuhn, ACL 2019)
PDF:
https://preview.aclanthology.org/update-css-js/P19-1012.pdf
Video:
https://vimeo.com/383962400