Marijn Koolen


2025

A Corpus of Early Modern Decision-Making - the Resolutions of the States General of the Dutch Republic
Marijn Koolen | Rik Hoekstra
Proceedings of the 5th Conference on Language, Data and Knowledge

This paper presents a corpus of early modern Dutch resolutions made in the daily meetings of the States General, the central governing body of the Dutch Republic, over a period of 220 years, from 1576 to 1796. This corpus has been digitised from over half a million scans of mostly handwritten text, segmented into individual resolutions (decisions) and enriched with named entities and metadata extracted from the text of the resolutions. We developed a pipeline for automatic text recognition for historic Dutch, and a document segmentation approach that combines ML classifiers trained on annotated data with rule-based fuzzy matching of the highly formulaic language of the resolutions. The decisions that the States General made were often based on propositions (requests or proposals) submitted in writing by other governing bodies and by citizens of the Republic. The resolutions contain information about these submitted propositions, including the persons and organisations who submitted them. The second part of this paper includes an analysis of the information about these proposition documents that can be extracted from the resolutions, and the potential to link the resolutions to their corresponding propositions using named entities and extracted metadata. We find that for the overwhelming majority of propositions, we can identify the name of the person or organisation that submitted it, making it feasible to (semi-)automatically link the resolutions to their corresponding proposition documents. This will allow historians and genealogists to study not only the decision-making of the States General in the early modern period, but also the concerns put forward by both high-ranking officials and regular citizens of the Republic.

2007

Deriving a Domain Specific Test Collection from a Query Log
Avi Arampatzis | Jaap Kamps | Marijn Koolen | Nir Nussbaum
Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007)