Discontinuous Constituency and BERT: A Case Study of Dutch

Konstantinos Kogkalidis, Gijs Wijnholds


Abstract
In this paper, we set out to quantify the syntactic capacity of BERT in the evaluation regime of non-context-free patterns, as occurring in Dutch. We devise a test suite based on a mildly context-sensitive formalism, from which we derive grammars that capture the linguistic phenomena of control verb nesting and verb raising. The grammars, paired with a small lexicon, provide us with a large collection of naturalistic utterances, annotated with verb-subject pairings, that serve as the evaluation test bed for an attention-based span selection probe. Our results, backed by extensive analysis, suggest that the models investigated fail in the implicit acquisition of the dependencies examined.
Anthology ID:
2022.findings-acl.298
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3776–3785
URL:
https://aclanthology.org/2022.findings-acl.298
DOI:
10.18653/v1/2022.findings-acl.298
Cite (ACL):
Konstantinos Kogkalidis and Gijs Wijnholds. 2022. Discontinuous Constituency and BERT: A Case Study of Dutch. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3776–3785, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Discontinuous Constituency and BERT: A Case Study of Dutch (Kogkalidis & Wijnholds, Findings 2022)
PDF:
https://preview.aclanthology.org/naacl24-info/2022.findings-acl.298.pdf
Software:
 2022.findings-acl.298.software.zip
Code
 gijswijnholds/discontinuous-probing