Daniël de Kok


When Beards Start Shaving Men: A Subject-object Resolution Test Suite for Morpho-syntactic and Semantic Model Introspection
Patricia Fischer | Daniël de Kok | Erhard Hinrichs
Proceedings of the 28th International Conference on Computational Linguistics

In this paper, we introduce the SORTS Subject-Object Resolution Test Suite of German minimal sentence pairs for model introspection. The full test suite consists of 18,502 transitive clauses with manual annotations of 8 word order patterns, 5 morphological and syntactic and 11 semantic property classes. The test suite has been constructed such that sentences are minimal pairs with respect to a property class. Each property has been selected with a particular focus on its effect on subject-object resolution, the second-most error-prone task within syntactic parsing of German after prepositional phrase attachment (Fischer et al., 2019). The size and detail of annotations make the test suite a valuable resource for natural language processing applications with syntactic and semantic tasks. We use dependency parsing to demonstrate how the test suite allows insights into the process of subject-object resolution. Based on the test suite annotations, word order and case syncretism can be identified as the most important factors that affect subject-object resolution.


No Word is an Island—A Transformation Weighting Model for Semantic Composition
Corina Dima | Daniël de Kok | Neele Witte | Erhard Hinrichs
Transactions of the Association for Computational Linguistics, Volume 7

Composition models of distributional semantics are used to construct phrase representations from the representations of their words. Composition models are typically situated on two ends of a spectrum. They either have a small number of parameters but compose all phrases in the same way, or they perform word-specific compositions at the cost of a far larger number of parameters. In this paper we propose transformation weighting (TransWeight), a composition model that consistently outperforms existing models on nominal compounds, adjective-noun phrases, and adverb-adjective phrases in English, German, and Dutch. TransWeight drastically reduces the number of parameters needed compared with the best model in the literature by composing similar words in the same way.

Association Metrics in Neural Transition-Based Dependency Parsing
Patricia Fischer | Sebastian Pütz | Daniël de Kok
Proceedings of the Fifth International Conference on Dependency Linguistics (Depling, SyntaxFest 2019)


Distributional regularities of verbs and verbal adjectives: Treebank evidence and broader implications
Daniël de Kok | Patricia Fischer | Corina Dima | Erhard Hinrichs
Proceedings of the 16th International Workshop on Treebanks and Linguistic Theories

PP Attachment: Where do We Stand?
Daniël de Kok | Jianqiang Ma | Corina Dima | Erhard Hinrichs
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

Prepositional phrase (PP) attachment is a well-known challenge to parsing. In this paper, we combine the insights of different works, namely: (1) treating PP attachment as a classification task with an arbitrary number of attachment candidates; (2) using auxiliary distributions to augment the data beyond the hand-annotated training set; (3) using topological fields to get information about the distribution of PP attachment throughout clauses; and (4) using state-of-the-art techniques such as word embeddings and neural networks. We show that jointly using these techniques leads to substantial improvements. We also conduct a qualitative analysis to gauge where the ceiling of the task is in a realistic setup.


Transition-based dependency parsing with topological fields
Daniël de Kok | Erhard Hinrichs
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)


Discriminative features in reversible stochastic attribute-value grammars
Daniël de Kok
Proceedings of the UCNLG+Eval: Language Generation and Evaluation Workshop

Reversible Stochastic Attribute-Value Grammars
Daniël de Kok | Barbara Plank | Gertjan van Noord
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies


Feature Selection for Fluency Ranking
Daniël de Kok
Proceedings of the 6th International Natural Language Generation Conference


A generalized method for iterative error mining in parsing results
Daniël de Kok | Jianqiang Ma | Gertjan van Noord
Proceedings of the 2009 Workshop on Grammar Engineering Across Frameworks (GEAF 2009)