Simon Dobnik

2021

Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)
Simon Dobnik | Lilja Øvrelid
Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa)

Proceedings of the Reasoning and Interaction Conference (ReInAct 2021)
Christine Howes | Simon Dobnik | Ellen Breitholtz | Stergios Chatzikyriakidis
Proceedings of the Reasoning and Interaction Conference (ReInAct 2021)

Reference and coreference in situated dialogue
Sharid Loáiciga | Simon Dobnik | David Schlangen
Proceedings of the Second Workshop on Advances in Language and Vision Research

In recent years several corpora have been developed for vision and language tasks. We argue that there is still significant room for corpora that increase the complexity of both visual and linguistic domains and which capture different varieties of perceptual and conversational contexts. Working with two corpora approaching this goal, we present a linguistic perspective on some of the challenges in creating and extending resources combining language and vision while preserving continuity with the existing best practices in the area of coreference annotation.

How Vision Affects Language: Comparing Masked Self-Attention in Uni-Modal and Multi-Modal Transformer
Nikolai Ilinykh | Simon Dobnik
Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR)

The problem of interpreting the knowledge learned by multi-head self-attention in transformers has been one of the central questions in NLP. However, much of this work has focused on models trained for uni-modal tasks, e.g. machine translation. In this paper, we examine masked self-attention in a multi-modal transformer trained for the task of image captioning. In particular, we test whether the multi-modality of the task objective affects the learned attention patterns. Our visualisations of masked self-attention demonstrate that (i) it can learn general linguistic knowledge of the textual input, and (ii) its attention patterns incorporate artefacts from the visual modality even though it has never accessed it directly. We compare our transformer’s attention patterns with masked attention in distilgpt-2 tested for uni-modal text generation of image captions. Based on the maps of extracted attention weights, we argue that masked self-attention in the image captioning transformer seems to be enhanced with semantic knowledge from images, exemplifying joint language-and-vision information in its attention patterns.

Annotating anaphoric phenomena in situated dialogue
Sharid Loáiciga | Simon Dobnik | David Schlangen
Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR)

In recent years several corpora have been developed for vision and language tasks. With this paper, we intend to start a discussion on the annotation of referential phenomena in situated dialogue. We argue that there is still significant room for corpora that increase the complexity of both visual and linguistic domains and which capture different varieties of perceptual and conversational contexts. In addition, a rich annotation scheme covering a broad range of referential phenomena and compatible with the textual task of coreference resolution is necessary in order to take the most advantage of these corpora. Consequently, there are several open questions regarding the semantics of reference and annotation, and the extent to which standard textual coreference accounts for the situated dialogue genre. Working with two corpora on situated dialogue, we present our extension to the ARRAU (Uryupina et al., 2020) annotation scheme in order to start this discussion.

2020

An Arabic Tweets Sentiment Analysis Dataset (ATSAD) using Distant Supervision and Self Training
Kathrein Abu Kwaik | Stergios Chatzikyriakidis | Simon Dobnik | Motaz Saad | Richard Johansson
Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection

As the number of social media users increases, they express their thoughts and needs, socialise, and publish their opinions and reviews. For good social media sentiment analysis, good quality resources are needed, and the lack of these resources is particularly evident for languages other than English, in particular Arabic. The available Arabic resources are lacking in either the size of the corpus or the quality of the annotation. In this paper, we present an Arabic Sentiment Analysis Corpus collected from Twitter, which contains 36K tweets labelled as positive or negative. We employed distant supervision and self-training approaches to annotate the corpus. In addition, we release 8K manually annotated tweets as a gold standard. We evaluated the corpus intrinsically by comparing it to human classification and pre-trained sentiment analysis models. Moreover, we apply extrinsic evaluation methods using a sentiment analysis task and achieve an accuracy of 86%.

When an Image Tells a Story: The Role of Visual and Semantic Information for Generating Paragraph Descriptions
Nikolai Ilinykh | Simon Dobnik
Proceedings of the 13th International Conference on Natural Language Generation

Generating multi-sentence image descriptions is a challenging task, which requires a good model to produce coherent and accurate paragraphs, describing salient objects in the image. We argue that multiple sources of information are beneficial when describing visual scenes with long sequences. These include (i) perceptual information and (ii) semantic (language) information about how to describe what is in the image. We also compare the effects of using two different pooling mechanisms on either a single modality or their combination. We demonstrate that the model which utilises both visual and language inputs can be used to generate accurate and diverse paragraphs when combined with a particular pooling mechanism. The results of our automatic and human evaluation show that learning to embed semantic information along with visual stimuli into the paragraph generation model is not trivial, raising a variety of proposals for future experiments.

Sky + Fire = Sunset. Exploring Parallels between Visually Grounded Metaphors and Image Classifiers
Yuri Bizzoni | Simon Dobnik
Proceedings of the Second Workshop on Figurative Language Processing

This work explores the differences and similarities between neural image classifiers’ mis-categorisations and visually grounded metaphors, which we could conceive of as intentional mis-categorisations. We discuss the possibility of using automatic image classifiers to approximate human metaphoric behaviours, and the limitations of such a frame. We report two pilot experiments to study grounded metaphoricity. In the first, we represent metaphors as a form of visual mis-categorisation. In the second, we model metaphors as a more flexible, compositional operation in a continuous visual space generated from automatic classification systems.

Fast visual grounding in interaction: bringing few-shot learning with neural networks to an interactive robot
José Miguel Cano Santín | Simon Dobnik | Mehdi Ghanimifard
Proceedings of the Probability and Meaning Conference (PaM 2020)

The major shortcomings of using neural networks with situated agents are that in incremental interaction very few learning examples are available and that their visual sensory representations are quite different from image caption datasets. In this work we adapt and evaluate a few-shot learning approach, Matching Networks (Vinyals et al., 2016), to conversational strategies of a robot interacting with a human tutor in order to efficiently learn to categorise objects that are presented to it and also investigate to what degree transfer learning from pre-trained models on images from different contexts can improve its performance. We discuss the implications of such learning on the nature of semantic representations the system has learned.

2019

Proceedings of the 13th International Conference on Computational Semantics - Long Papers
Simon Dobnik | Stergios Chatzikyriakidis | Vera Demberg
Proceedings of the 13th International Conference on Computational Semantics - Long Papers

Proceedings of the 13th International Conference on Computational Semantics - Short Papers
Simon Dobnik | Stergios Chatzikyriakidis | Vera Demberg
Proceedings of the 13th International Conference on Computational Semantics - Short Papers

Proceedings of the 13th International Conference on Computational Semantics - Student Papers
Simon Dobnik | Stergios Chatzikyriakidis | Vera Demberg | Kathrein Abu Kwaik | Vladislav Maraev
Proceedings of the 13th International Conference on Computational Semantics - Student Papers

ImageTTR: Grounding Type Theory with Records in Image Classification for Visual Question Answering
Arild Matsson | Simon Dobnik | Staffan Larsson
Proceedings of the IWCS 2019 Workshop on Computing Semantics with Types, Frames and Related Structures

What a neural language model tells us about spatial relations
Mehdi Ghanimifard | Simon Dobnik
Proceedings of the Combined Workshop on Spatial Language Understanding (SpLU) and Grounded Communication for Robotics (RoboNLP)

Understanding and generating spatial descriptions requires knowledge about what objects are related, their functional interactions, and where the objects are geometrically located. Different spatial relations have different functional and geometric bias. The wide usage of neural language models in different areas including generation of image description motivates the study of what kind of knowledge is encoded in neural language models about individual spatial relations. With the premise that the functional bias of relations is expressed in their word distributions, we construct multi-word distributional vector representations and show that these representations perform well on intrinsic semantic reasoning tasks, thus confirming our premise. A comparison of our vector representations to human semantic judgments indicates that different bias (functional or geometric) is captured in different data collection tasks which suggests that the contribution of the two meaning modalities is dynamic, related to the context of the task.

Neural Models for Detecting Binary Semantic Textual Similarity for Algerian and MSA
Wafia Adouane | Jean-Philippe Bernardy | Simon Dobnik
Proceedings of the Fourth Arabic Natural Language Processing Workshop

We explore the extent to which neural networks can learn to identify semantically equivalent sentences from a small variable dataset using end-to-end training. We collect a new noisy non-standardised user-generated Algerian (ALG) dataset and also translate it to Modern Standard Arabic (MSA) which serves as its regularised counterpart. We compare the performance of various models on both datasets and report the best performing configurations. The results show that relatively simple models composed of 2 LSTM layers outperform other, more sophisticated attention-based architectures by a wide margin on both the ALG and MSA datasets.

Can Modern Standard Arabic Approaches be used for Arabic Dialects? Sentiment Analysis as a Case Study
Chatrine Qwaider | Stergios Chatzikyriakidis | Simon Dobnik
Proceedings of the 3rd Workshop on Arabic Corpus Linguistics

What goes into a word: generating image descriptions with top-down spatial knowledge
Mehdi Ghanimifard | Simon Dobnik
Proceedings of the 12th International Conference on Natural Language Generation

Generating grounded image descriptions requires associating linguistic units with their corresponding visual clues. A common method is to train a decoder language model with an attention mechanism over convolutional visual features. Attention weights align the stratified visual features arranged by their location with tokens, most commonly words, in the target description. However, words such as spatial relations (e.g. next to and under) do not refer directly to geometric arrangements of pixels but to complex geometric and conceptual representations. The aim of this paper is to evaluate what representations facilitate generating image descriptions with spatial relations and lead to better grounded language generation. In particular, we investigate the contribution of three different representational modalities in generating relational referring expressions: (i) pre-trained convolutional visual features, (ii) different top-down geometric relational knowledge between objects, and (iii) world knowledge captured by contextual embeddings in language models.

Normalising Non-standardised Orthography in Algerian Code-switched User-generated Data
Wafia Adouane | Jean-Philippe Bernardy | Simon Dobnik
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)

We work with Algerian, an under-resourced non-standardised Arabic variety, for which we compile a new parallel corpus consisting of user-generated textual data matched with normalised and corrected human annotations following data-driven and our linguistically motivated standard. We use an end-to-end deep neural model designed to deal with context-dependent spelling correction and normalisation. Results indicate that a model with two CNN sub-network encoders and an LSTM decoder performs the best, and that word context matters. Additionally, pre-processing data token-by-token with an edit-distance based aligner significantly improves the performance. We get promising results for the spelling correction and normalisation, as a pre-processing step for downstream tasks, on detecting binary Semantic Textual Similarity.

2018

A Comparison of Character Neural Language Model and Bootstrapping for Language Identification in Multilingual Noisy Texts
Wafia Adouane | Simon Dobnik | Jean-Philippe Bernardy | Nasredine Semmar
Proceedings of the Second Workshop on Subword/Character LEvel Models

This paper seeks to examine the effect of including background knowledge in the form of a pre-trained character neural language model (LM), and of data bootstrapping, to overcome the problem of unbalanced limited resources. As a test, we explore the task of language identification in mixed-language short non-edited texts with an under-resourced language, namely the case of Algerian Arabic for which both labelled and unlabelled data are limited. We compare the performance of two traditional machine learning methods and a deep neural network (DNN) model. The results show that overall DNNs perform better on labelled data for the majority categories and struggle with the minority ones. While the effect of the untokenised and unlabelled data encoded as LM differs for each category, bootstrapping improves the performance of all systems and all categories. These methods are language independent and could be generalised to other under-resourced languages for which a small amount of labelled data and a larger amount of unlabelled data are available.

Exploring the Functional and Geometric Bias of Spatial Relations Using Neural Language Models
Simon Dobnik | Mehdi Ghanimifard | John Kelleher
Proceedings of the First International Workshop on Spatial Language Understanding

The challenge for computational models of spatial descriptions for situated dialogue systems is the integration of information from different modalities. The semantics of spatial descriptions are grounded in at least two sources of information: (i) a geometric representation of space and (ii) the functional interaction of related objects. We train several neural language models on descriptions of scenes from a dataset of image captions and examine whether the functional or geometric bias of spatial descriptions reported in the literature is reflected in the estimated perplexity of these models. The results of these experiments have implications for the creation of models of spatial lexical semantics for human-robot dialogue systems. Furthermore, they also provide an insight into the kinds of semantic knowledge captured by neural language models trained on spatial descriptions, which has implications for image captioning systems.

Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts
Wafia Adouane | Jean-Philippe Bernardy | Simon Dobnik
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching

We explore the effect of injecting background knowledge into different deep neural network (DNN) configurations in order to mitigate the problem of the scarcity of annotated data when applying these models to datasets of low-resourced languages. The background knowledge is encoded in the form of lexicons and pre-trained sub-word embeddings. The DNN models are evaluated on the task of detecting code-switching and borrowing points in non-standardised user-generated Algerian texts. Overall results show that DNNs benefit from adding background knowledge. However, the gain varies between models and categories. The proposed DNN architectures are generic and could be applied to other low-resourced languages.

Shami: A Corpus of Levantine Arabic Dialects
Kathrein Abu Kwaik | Motaz Saad | Stergios Chatzikyriakidis | Simon Dobnik
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

KILLE: a Framework for Situated Agents for Learning Language Through Interaction
Simon Dobnik | Erik de Graaf
Proceedings of the 21st Nordic Conference on Computational Linguistics

Identification of Languages in Algerian Arabic Multilingual Documents
Wafia Adouane | Simon Dobnik
Proceedings of the Third Arabic Natural Language Processing Workshop

This paper presents a language identification system designed to detect the language of each word, in its context, in multilingual documents as generated in social media by bilingual/multilingual communities, in our case speakers of Algerian Arabic. We frame the task as a sequence tagging problem and use supervised machine learning with standard methods like HMM and N-gram classification tagging. We also experiment with a lexicon-based method. Combining all the methods in a fall-back mechanism and introducing some linguistic rules, to deal with unseen tokens and ambiguous words, gives an overall accuracy of 93.14%. Finally, we introduce rules for language identification from sequences of recognised words.

Learning to Compose Spatial Relations with Grounded Neural Language Models
Mehdi Ghanimifard | Simon Dobnik
IWCS 2017 - 12th International Conference on Computational Semantics - Long papers

An overview of Natural Language Inference Data Collection: The way forward?
Stergios Chatzikyriakidis | Robin Cooper | Simon Dobnik | Staffan Larsson
Proceedings of the Computing Natural Language Inference Workshop

2015

Probabilistic Type Theory and Natural Language Semantics
Robin Cooper | Simon Dobnik | Shalom Lappin | Staffan Larsson
Linguistic Issues in Language Technology, Volume 10, 2015

Type theory has played an important role in specifying the formal connection between syntactic structure and semantic interpretation within the history of formal semantics. In recent years rich type theories developed for the semantics of programming languages have become influential in the semantics of natural language. The use of probabilistic reasoning to model human learning and cognition has become an increasingly important part of cognitive science. In this paper we offer a probabilistic formulation of a rich type theory, Type Theory with Records (TTR), and we illustrate how this framework can be used to approach the problem of semantic learning. Our probabilistic version of TTR is intended to provide an interface between the cognitive process of classifying situations according to the types that they instantiate, and the compositional semantics of natural language.
