Sander Wubben

2020

The CACAPO Dataset: A Multilingual, Multi-Domain Dataset for Neural Pipeline and End-to-End Data-to-Text Generation
Chris van der Lee | Chris Emmery | Sander Wubben | Emiel Krahmer
Proceedings of the 13th International Conference on Natural Language Generation

This paper describes the CACAPO dataset, built for training both neural pipeline and end-to-end data-to-text language generation systems. The dataset is multilingual (Dutch and English), and contains almost 10,000 sentences from human-written news texts in the sports, weather, stocks, and incidents domains, together with aligned attribute-value paired data. The dataset is unique in that the linguistic variation and indirect ways of expressing data in these texts reflect the challenges of real-world NLG tasks.

2019

Best practices for the human evaluation of automatically generated text
Chris van der Lee | Albert Gatt | Emiel van Miltenburg | Sander Wubben | Emiel Krahmer
Proceedings of the 12th International Conference on Natural Language Generation

Currently, there is little agreement as to how Natural Language Generation (NLG) systems should be evaluated. While there is some agreement regarding automatic metrics, there is a high degree of variation in the way that human evaluation is carried out. This paper provides an overview of how human evaluation is currently conducted, and presents a set of best practices, grounded in the literature. With this paper, we hope to contribute to the quality and consistency of human evaluations in NLG.

2018

Evaluating the text quality, human likeness and tailoring component of PASS: A Dutch data-to-text system for soccer
Chris van der Lee | Bart Verduijn | Emiel Krahmer | Sander Wubben
Proceedings of the 27th International Conference on Computational Linguistics

We present an evaluation of PASS, a data-to-text system that generates Dutch soccer reports from match statistics which are automatically tailored towards fans of one club or the other. The evaluation in this paper consists of two studies. An intrinsic human-based evaluation of the system’s output is described in the first study. In this study it was found that compared to human-written texts, computer-generated texts were rated slightly lower on style-related text components (fluency and clarity) and slightly higher in terms of the correctness of given information. Furthermore, results from the first study showed that tailoring was accurately recognized in most cases, and that participants struggled with correctly identifying whether a text was written by a human or computer. The second study investigated if tailoring affects perceived text quality, for which no results were garnered. This lack of results might be due to negative preconceptions about computer-generated texts which were found in the first study.

Aspect-based summarization of pros and cons in unstructured product reviews
Florian Kunneman | Sander Wubben | Antal van den Bosch | Emiel Krahmer
Proceedings of the 27th International Conference on Computational Linguistics

We developed three systems for generating pros and cons summaries of product reviews. Automating this task eases the writing of product reviews, and offers readers quick access to the most important information. We compared SynPat, a system based on syntactic phrases selected on the basis of valence scores, against a neural-network-based system trained to map bag-of-words representations of reviews directly to pros and cons, and the same neural system trained on clusters of word-embedding encodings of similar pros and cons. We evaluated the systems in two ways: first on held-out reviews with gold-standard pros and cons, and second by asking human annotators to rate the systems’ output on relevance and completeness. In the second evaluation, the gold-standard pros and cons were assessed along with the system output. We find that the human-generated summaries are not deemed as significantly more relevant or complete than the SynPat systems; the latter are scored higher than the human-generated summaries on a precision metric. The neural approaches yield a lower performance in the human assessment, and are outperformed by the baseline.

NeuralREG: An end-to-end approach to referring expression generation
Thiago Castro Ferreira | Diego Moussallem | Ákos Kádár | Sander Wubben | Emiel Krahmer
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Traditionally, Referring Expression Generation (REG) models first decide on the form and then on the content of references to discourse entities in text, typically relying on features such as salience and grammatical function. In this paper, we present a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and content in one go without explicit feature extraction. Using a delexicalized version of the WebNLG corpus, we show that the neural model substantially improves over two strong baselines.

Surface Realization Shared Task 2018 (SR18): The Tilburg University Approach
Thiago Castro Ferreira | Sander Wubben | Emiel Krahmer
Proceedings of the First Workshop on Multilingual Surface Realisation

This study describes the approach developed by the Tilburg University team to the shallow task of the Multilingual Surface Realization Shared Task 2018 (SR18). Based on (Castro Ferreira et al., 2017), the approach works by first preprocessing an input dependency tree into an ordered linearized string, which is then realized using a statistical machine translation model. Our approach shows promising results, with BLEU scores above 50 for 5 different languages (English, French, Italian, Portuguese and Spanish) and above 35 for the Dutch language.

Automated learning of templates for data-to-text generation: comparing rule-based, statistical and neural methods
Chris van der Lee | Emiel Krahmer | Sander Wubben
Proceedings of the 11th International Conference on Natural Language Generation

The current study investigated novel techniques and methods for trainable approaches to data-to-text generation. Neural Machine Translation was explored for the conversion from data to text as well as the addition of extra templatization steps of the data input and text output in the conversion process. Evaluation using BLEU did not find the Neural Machine Translation technique to perform any better compared to rule-based or Statistical Machine Translation, and the templatization method seemed to perform similarly or sometimes worse compared to direct data-to-text conversion. However, the human evaluation metrics indicated that Neural Machine Translation yielded the highest quality output and that the templatization method was able to increase text quality in multiple situations.

Enriching the WebNLG corpus
Thiago Castro Ferreira | Diego Moussallem | Emiel Krahmer | Sander Wubben
Proceedings of the 11th International Conference on Natural Language Generation

This paper describes the enrichment of the WebNLG corpus (Gardent et al., 2017a,b), with the aim to further extend its usefulness as a resource for evaluating common NLG tasks, including Discourse Ordering, Lexicalization and Referring Expression Generation. We also produce a silver-standard German translation of the corpus to enable the exploitation of NLG approaches to languages other than English. The enriched corpus is publicly available.

Applications of NLG in practical conversational AI settings
Sander Wubben
Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)

2017

Linguistic realisation as machine translation: Comparing different MT models for AMR-to-text generation
Thiago Castro Ferreira | Iacer Calixto | Sander Wubben | Emiel Krahmer
Proceedings of the 10th International Conference on Natural Language Generation

In this paper, we study AMR-to-text generation, framing it as a translation task and comparing two different MT approaches (Phrase-based and Neural MT). We systematically study the effects of 3 AMR preprocessing steps (Delexicalisation, Compression, and Linearisation) applied before the MT phase. Our results show that preprocessing indeed helps, although the benefits differ for the two MT models.

PASS: A Dutch data-to-text system for soccer, targeted towards specific audiences
Chris van der Lee | Emiel Krahmer | Sander Wubben
Proceedings of the 10th International Conference on Natural Language Generation

We present PASS, a data-to-text system that generates Dutch soccer reports from match statistics. One of the novel elements of PASS is that the system produces corpus-based texts tailored towards fans of one club or the other, which can most prominently be observed in the tone of voice used in the reports. Furthermore, the system is open source and uses a modular design, which makes it relatively easy for people to add extensions. Human-based evaluation shows that people are generally positive towards PASS with regard to its clarity and fluency, and that the tailoring is accurately recognized in most cases.

Generating flexible proper name references in text: Data, models and evaluation
Thiago Castro Ferreira | Emiel Krahmer | Sander Wubben
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

This study introduces a statistical model able to generate variations of a proper name by taking into account the person to be mentioned, the discourse context and variation. The model relies on the REGnames corpus, a dataset with 53,102 proper name references to 1,000 people in different discourse contexts. We evaluate the versions of our model from the perspective of how human writers produce proper names, and also how human readers process them. The corpus and the model are publicly available.

2016

Towards more variation in text generation: Developing and evaluating variation models for choice of referential form
Thiago Castro Ferreira | Emiel Krahmer | Sander Wubben
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Individual Variation in the Choice of Referential Form
Thiago Castro Ferreira | Emiel Krahmer | Sander Wubben
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

SatiricLR: a Language Resource of Satirical News Articles
Alice Frain | Sander Wubben
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

In this paper we introduce the Satirical Language Resource: a dataset containing a balanced collection of satirical and non-satirical news texts from various domains. This is the first dataset of this magnitude and scope in the domain of satire. We envision this dataset will facilitate studies on various aspects of satire in news articles. We test the viability of our data on the task of classification of satire.

Abstractive Compression of Captions with Attentive Recurrent Neural Networks
Sander Wubben | Emiel Krahmer | Antal van den Bosch | Suzan Verberne
Proceedings of the 9th International Natural Language Generation conference

Towards proper name generation: a corpus analysis
Thiago Castro Ferreira | Sander Wubben | Emiel Krahmer
Proceedings of the 9th International Natural Language Generation conference

2015

Predicting Ratings for New Movie Releases from Twitter Content
Wernard Schmit | Sander Wubben
Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

2014

Creating and using large monolingual parallel corpora for sentential paraphrase generation
Sander Wubben | Antal van den Bosch | Emiel Krahmer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

In this paper we investigate the automatic generation of paraphrases by using machine translation techniques. Three contributions we make are the construction of a large paraphrase corpus for English and Dutch, a re-ranking heuristic to use machine translation for paraphrase generation and a proper evaluation methodology. A large parallel corpus is constructed by aligning clustered headlines that are scraped from a news aggregator site. To generate sentential paraphrases we use a standard phrase-based machine translation (PBMT) framework modified with a re-ranking component (henceforth PBMT-R). We demonstrate this approach for Dutch and English and evaluate by using human judgements collected from 76 participants. The judgments are compared to two automatic machine translation evaluation metrics. We observe that as the paraphrases deviate more from the source sentence, the performance of the PBMT-R system degrades less than that of the word substitution baseline system.
