Niklas Deworetzki


2026

Syntactic query languages such as Grew and dep_search allow looking for grammatical patterns in linguistically annotated corpora. However, these languages are often unsupported by large-scale corpus management tools, whose queries are essentially sequential. In this paper, we present CQP/Tree, a tool to convert syntactic queries into CQL, the Corpus Query Language used in Corpus Workbench, SketchEngine, Korp and several other such systems. In this framework, syntactic queries act as _syntactic sugar_: they allow expressing complex CQL queries in a more readable and concise fashion, thus bridging the gap between expressive linguistic search and large-scale corpora. CQP/Tree is available as a web and command-line tool, as well as an open-source Python library.

2025

We investigate whether labeled property graphs and graph databases can be a useful and efficient way of encoding UD treebanks to facilitate searching for complex syntactic phenomena. We give two alternative encodings of UD treebanks in the off-the-shelf graph database Neo4j and show how to translate syntactic queries into the graph query language Cypher. Our evaluation shows that graph databases can improve query times by several orders of magnitude compared to existing approaches.
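
To make the idea of a property-graph encoding concrete, the following is a minimal sketch of how one CoNLL-U sentence might be turned into a Cypher `CREATE` statement. It is not one of the two encodings described in the paper: the node label `Token`, the relationship type `DEP`, the property names, and the helper functions `parse_conllu` and `to_cypher` are all assumptions made for illustration. String values are quoted with `json.dumps`, which is a simplification that is adequate for plain word forms.

```python
import json

# A three-token sentence in CoNLL-U format (tab-separated, 10 columns).
CONLLU = (
    "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\treads\tread\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tbooks\tbook\tNOUN\t_\t_\t2\tobj\t_\t_\n"
)


def parse_conllu(text):
    """Parse the token lines of a single CoNLL-U sentence into dicts."""
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword-token ranges (1-2) and empty nodes (1.1)
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "lemma": cols[2],
            "upos": cols[3],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    return tokens


def to_cypher(tokens):
    """Build one CREATE statement: a node per token, an edge per dependency.

    Hypothetical schema: (:Token)-[:DEP {deprel}]->(:Token), head to dependent.
    """
    nodes = [
        f"(t{t['id']}:Token {{id: {t['id']}, form: {json.dumps(t['form'])}, "
        f"lemma: {json.dumps(t['lemma'])}, upos: {json.dumps(t['upos'])}}})"
        for t in tokens
    ]
    edges = [
        f"(t{t['head']})-[:DEP {{deprel: {json.dumps(t['deprel'])}}}]->(t{t['id']})"
        for t in tokens
        if t["head"] != 0  # the root has no incoming dependency edge
    ]
    return "CREATE " + ",\n       ".join(nodes + edges)


# Under the same assumed schema, a syntactic query such as
# "a verb with a nominal direct object" could be written in Cypher as:
VERB_OBJ_QUERY = (
    'MATCH (v:Token {upos: "VERB"})-[:DEP {deprel: "obj"}]->(o:Token {upos: "NOUN"}) '
    "RETURN v.form, o.form"
)

if __name__ == "__main__":
    print(to_cypher(parse_conllu(CONLLU)))
    print(VERB_OBJ_QUERY)
```

The generated statement and the query are plain strings, so they can be inspected without a running database; sending them to Neo4j would additionally require a driver and a server, which are outside the scope of this sketch.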