Nicoline van der Sijs


The EDGeS Diachronic Bible Corpus
Gerlof Bouma | Evie Coussé | Trude Dijkstra | Nicoline van der Sijs
Proceedings of the Twelfth Language Resources and Evaluation Conference

We present the EDGeS Diachronic Bible Corpus: a diachronically and synchronically parallel corpus of Bible translations in Dutch, English, German and Swedish, with texts from the 14th century until today. It is compiled in the context of an intended longitudinal and contrastive study of complex verb constructions in Germanic. The paper discusses the corpus design principles, its selection of 36 Bibles, and the information and metadata encoded for the corpus texts. The EDGeS corpus will be available in two forms: the whole corpus will be accessible for researchers behind a login in the well-known OPUS search infrastructure, and the open subpart of the corpus will be available for download.


A Fast and Flexible Webinterface for Dialect Research in the Low Countries
Roeland van Hout | Nicoline van der Sijs | Erwin Komen | Henk van den Heuvel
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)


Nederlab: Towards a Single Portal and Research Environment for Diachronic Dutch Text Corpora
Hennie Brugman | Martin Reynaert | Nicoline van der Sijs | René van Stipriaan | Erik Tjong Kim Sang | Antal van den Bosch
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The Nederlab project aims to bring together all digitized texts relevant to the Dutch national heritage and to the history of the Dutch language and culture (circa 800 – present) in one user-friendly, tool-enriched, open-access web interface. This paper describes Nederlab halfway through the project period and discusses the collections incorporated, the back-office processes, the system back-end, and the Nederlab Research Portal end-user web application.

Curation of Dutch Regional Dictionaries
Henk van den Heuvel | Eric Sanders | Nicoline van der Sijs
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This paper describes the process of semi-automatically converting dictionaries from paper to structured text (database) and the integration of these into the CLARIN infrastructure in order to make the dictionaries accessible and retrievable for the research community. The case study at hand is that of the curation of 42 fascicles of the Dictionaries of the Brabantic and Limburgian dialects, and 6 fascicles of the Dictionary of dialects in Gelderland.