Marten Postma


Variation in framing as a function of temporal reporting distance
Levi Remijnse | Marten Postma | Piek Vossen
Proceedings of the 14th International Conference on Computational Semantics (IWCS)

In this paper, we measure variation in framing as a function of foregrounding and backgrounding in a co-referential corpus with a range of temporal distance. In one type of experiment, frame-annotated corpora grouped under event types were contrasted, resulting in a ranking of frames with typicality rates. When contrasting publication dates, a different ranking of frames emerged for documents that are close to or far from the event instance. In the second type of analysis, we trained a diagnostic classifier with frame occurrences in order to let it differentiate documents based on their temporal distance class (close to or far from the event instance). The classifier performs above chance and outperforms models with words.


Combining Conceptual and Referential Annotation to Study Variation in Framing
Marten Postma | Levi Remijnse | Filip Ilievski | Antske Fokkens | Sam Titarsolej | Piek Vossen
Proceedings of the International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet

We introduce an annotation tool whose purpose is to gain insights into variation of framing by combining FrameNet annotation with referential annotation. English FrameNet enables researchers to study variation in framing at the conceptual level as well as through its packaging in language. We enrich FrameNet annotations in two ways. First, we introduce the referential aspect. Secondly, we annotate on complete texts to encode connections between mentions. As a result, we can analyze the variation of framing for one particular event across multiple mentions and (cross-lingual) documents. We can examine how an event is framed over time and how core frame elements are expressed throughout a complete text. The data model starts with a representation of an event type. Each event type has many incidents linked to it, and each incident has several reference texts describing it as well as structured data about the incident. The user can apply two types of annotations: 1) mappings from expressions to frames and frame elements, 2) reference relations from mentions to events and participants of the structured data.

Large-scale Cross-lingual Language Resources for Referencing and Framing
Piek Vossen | Filip Ilievski | Marten Postma | Antske Fokkens | Gosse Minnema | Levi Remijnse
Proceedings of the Twelfth Language Resources and Evaluation Conference

In this article, we lay out the basic ideas and principles of the project Framing Situations in the Dutch Language. We provide our first results of data acquisition, together with the first data release. We introduce the notion of cross-lingual referential corpora. These corpora consist of texts that make reference to exactly the same incidents. The referential grounding allows us to analyze the framing of these incidents in different languages and across different texts. During the project, we will use the automatically generated data to study linguistic framing as a phenomenon and to build framing resources such as lexicons and corpora. We expect to capture larger variation in framing compared to traditional approaches for building such resources. Our first data release, which contains structured data about a large number of incidents and reference texts, can be found at


Don’t Annotate, but Validate: a Data-to-Text Method for Capturing Event Data
Piek Vossen | Filip Ilievski | Marten Postma | Roxane Segers
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

SemEval-2018 Task 5: Counting Events and Participants in the Long Tail
Marten Postma | Filip Ilievski | Piek Vossen
Proceedings of the 12th International Workshop on Semantic Evaluation

This paper discusses SemEval-2018 Task 5: a referential quantification task of counting events and participants in local, long-tail news documents with high ambiguity. The complexity of this task challenges systems to establish the meaning, reference and identity across documents. The task consists of three subtasks and spans across three domains. We detail the design of this referential quantification task, describe the participating systems, and present additional analysis to gain deeper insight into their performance.

A Deep Dive into Word Sense Disambiguation with LSTM
Minh Le | Marten Postma | Jacopo Urbani | Piek Vossen
Proceedings of the 27th International Conference on Computational Linguistics

LSTM-based language models have been shown to be effective in Word Sense Disambiguation (WSD). In particular, the technique proposed by Yuan et al. (2016) returned state-of-the-art performance in several benchmarks, but neither the training data nor the source code was released. This paper presents the results of a reproduction study and analysis of this technique using only openly available datasets (GigaWord, SemCor, OMSTI) and software (TensorFlow). Our study showed that similar results can be obtained with much less data than hinted at by Yuan et al. (2016). Detailed analyses shed light on the strengths and weaknesses of this method. First, adding more unannotated training data is useful, but is subject to diminishing returns. Second, the model can correctly identify both popular and unpopular meanings. Finally, the limited sense coverage in the annotated datasets is a major limitation. All code and trained models are made freely available.


Addressing the MFS Bias in WSD systems
Marten Postma | Ruben Izquierdo | Eneko Agirre | German Rigau | Piek Vossen
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

Word Sense Disambiguation (WSD) systems tend to have a strong bias towards assigning the Most Frequent Sense (MFS), which results in high performance on the MFS but very low performance on the less frequent senses. We addressed the MFS bias in WSD systems by combining the output from a WSD system with a set of mostly static features to create an MFS classifier that decides when to choose the MFS and when not to. The output from this MFS classifier, which is based on the Random Forest algorithm, is then used to modify the output from the original WSD system. We applied our classifier to one of the state-of-the-art supervised WSD systems, i.e. IMS, and to one of the best state-of-the-art unsupervised WSD systems, i.e. UKB. Our main finding is that we are able to improve the system output in terms of choosing between the MFS and the less frequent senses. When we apply the MFS classifier to fine-grained WSD, we observe an improvement on the less frequent sense cases, whereas we maintain the overall recall.

Open Dutch WordNet
Marten Postma | Emiel van Miltenburg | Roxane Segers | Anneleen Schoen | Piek Vossen
Proceedings of the 8th Global WordNet Conference (GWC)

We describe Open Dutch WordNet, which has been derived from the Cornetto database, the Princeton WordNet and open source resources. We exploited existing equivalence relations between Cornetto synsets and WordNet synsets in order to move the open source content from Cornetto into WordNet synsets. Currently, Open Dutch WordNet contains 117,914 synsets, of which 51,588 synsets contain at least one Dutch synonym, which leaves 66,326 synsets still to obtain a Dutch synonym. The average polysemy is 1.5. The resource is currently delivered in XML under the CC BY-SA 4.0 license and it has been linked to the Global Wordnet Grid. In order to use the resource, we refer to: https: //

Moving away from semantic overfitting in disambiguation datasets
Marten Postma | Filip Ilievski | Piek Vossen | Marieke van Erp
Proceedings of the Workshop on Uphill Battles in Language Processing: Scaling Early Achievements to Robust Methods

Semantic overfitting: what ‘world’ do we consider when evaluating disambiguation of text?
Filip Ilievski | Marten Postma | Piek Vossen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Semantic text processing faces the challenge of defining the relation between lexical expressions and the world to which they make reference within a period of time. It is unclear whether the current test sets used to evaluate disambiguation tasks are representative of the full complexity of this time-anchored relation, resulting in semantic overfitting to a specific period and the frequent phenomena within. We conceptualize and formalize a set of metrics which evaluate this complexity of datasets. We provide evidence for their applicability on five different disambiguation tasks. To challenge semantic overfitting of disambiguation systems, we propose a time-based, metric-aware method for developing datasets in a systematic and semi-automated manner, as well as an event-based QA task.

More is not always better: balancing sense distributions for all-words Word Sense Disambiguation
Marten Postma | Ruben Izquierdo Bevia | Piek Vossen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Current Word Sense Disambiguation systems show extremely poor performance on low-frequency senses, which is mainly caused by the difference in sense distributions between training and test data. The main focus in tackling this problem has been on acquiring more data or selecting a single predominant sense and not necessarily on the meta properties of the data itself. We demonstrate that these properties, such as the volume, provenance, and balancing, play an important role with respect to system performance. In this paper, we describe a set of experiments to analyze these meta properties in the framework of a state-of-the-art WSD system when evaluated on the SemEval-2013 English all-words dataset. We show that volume and provenance are indeed important, but that approximating the perfect balancing of the selected training data leads to an improvement of 21 points and exceeds state-of-the-art systems by 14 points while using only simple features. We therefore conclude that unsupervised acquisition of training data should be guided by strategies aimed at matching meta properties.


VUA-background : When to Use Background Information to Perform Word Sense Disambiguation
Marten Postma | Ruben Izquierdo | Piek Vossen
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)


What implementation and translation teach us: the case of semantic similarity measures in wordnets
Marten Postma | Piek Vossen
Proceedings of the Seventh Global Wordnet Conference


Offspring from Reproduction Problems: What Replication Failure Teaches Us
Antske Fokkens | Marieke van Erp | Marten Postma | Ted Pedersen | Piek Vossen | Nuno Freire
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)