Maarten Marx


2020

Who mentions whom? Recognizing political actors in proceedings
Lennart Kerkvliet | Jaap Kamps | Maarten Marx
Proceedings of the Second ParlaCLARIN Workshop

We show that it is straightforward to train a state-of-the-art named entity tagger (spaCy) to recognize political actors in Dutch parliamentary proceedings with high accuracy. The tagger was trained on 3.4K manually labeled examples, which were created in a modest 2.5 days of work. This resource is made available on GitHub. Besides proper nouns of persons and political parties, the tagger can recognize quite complex definite descriptions referring to cabinet ministers, ministries, and parliamentary committees. We also provide a demo search engine which employs the tagged entities in its SERP and result summaries.

2010

DutchParl. The Parliamentary Documents in Dutch
Maarten Marx | Anne Schuth
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

A corpus called DutchParl is created which aims to contain all digitally available parliamentary documents written in the Dutch language. The first version of DutchParl contains documents from the parliaments of The Netherlands, Flanders and Belgium. The corpus is divided along three dimensions: by parliament, by scanned versus digital documents, and by written recordings of spoken text versus other documents. The digital collection contains more than 800 million tokens, the scanned collection more than 1 billion. All documents are available as UTF-8 encoded XML files with extensive metadata in the Dublin Core standard. The text itself is divided into pages, which are divided into paragraphs. Every document, page and paragraph has a unique URN which resolves to a web page. Every page element in the XML files is connected to a facsimile image of that page in PDF or JPEG format. We created a viewer in which both versions can be inspected simultaneously. The corpus is available for download in several formats. The corpus can be used for corpus-linguistic and political science research, and is suitable for performing scalability tests for XML information systems.
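
The document layout described in this abstract (Dublin Core metadata, pages containing paragraphs, a URN on every level, a facsimile link per page) could be read roughly as in the sketch below. It is only an illustration: the element names (`document`, `meta`, `page`, `paragraph`), the attribute names (`urn`, `facsimile`), the URN values and the sample text are all assumptions made for this example and are not taken from the actual DutchParl schema. Only the Dublin Core namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core element namespace.
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample document: tag names, attribute names and URNs are
# assumptions for illustration, NOT the real DutchParl schema.
SAMPLE = """<document urn="urn:example:doc1"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <meta>
    <dc:title>Example proceedings</dc:title>
    <dc:date>2009-01-01</dc:date>
  </meta>
  <page urn="urn:example:doc1:p1" facsimile="doc1-p1.pdf">
    <paragraph urn="urn:example:doc1:p1:1">First paragraph.</paragraph>
    <paragraph urn="urn:example:doc1:p1:2">Second paragraph.</paragraph>
  </page>
  <page urn="urn:example:doc1:p2" facsimile="doc1-p2.jpg">
    <paragraph urn="urn:example:doc1:p2:1">Third paragraph.</paragraph>
  </page>
</document>"""


def read_document(xml_text):
    """Return (metadata, paragraphs) for one document.

    metadata   : dict mapping Dublin Core field name to its text
    paragraphs : list of (page URN, facsimile file, paragraph URN, text)
    """
    root = ET.fromstring(xml_text)

    # Collect every element in the Dublin Core namespace, keyed by local name.
    prefix = "{" + DC + "}"
    metadata = {
        el.tag[len(prefix):]: el.text
        for el in root.iter()
        if el.tag.startswith(prefix)
    }

    # Walk pages in document order, then the paragraphs inside each page.
    paragraphs = [
        (page.get("urn"), page.get("facsimile"), par.get("urn"), par.text)
        for page in root.findall("page")
        for par in page.findall("paragraph")
    ]
    return metadata, paragraphs


metadata, paragraphs = read_document(SAMPLE)
for page_urn, facsimile, par_urn, text in paragraphs:
    print(page_urn, facsimile, par_urn, text)
```

Keeping the page URN and facsimile file name next to each paragraph mirrors the idea of inspecting the text and the page image side by side, as the viewer mentioned in the abstract does.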

2004

Using WordNet to Measure Semantic Orientations of Adjectives
Jaap Kamps | Maarten Marx | Robert J. Mokken | Maarten de Rijke
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)