Proceedings of the Probability and Meaning Conference (PaM 2020)

Christine Howes, Stergios Chatzikyriakidis, Adam Ek, Vidya Somashekarappa (Editors)

Association for Computational Linguistics

‘Practical’, if that’s the word
Eimear Maguire

Certain conditionals have something other than a clause as their consequent: their antecedent if-clauses are ‘adverbial clauses’ without a verb. We argue that they function in a way already seen for those with clausal consequents, despite lacking the content we might expect for the formation of a conditional. The use of the if-clause with sub-clausal consequents is feasible thanks to the fact that this function does not depend on the consequent content, and so is not impeded when the consequent does not provide a proposition, question or imperative. To support this we provide meaning rules for conditionals in terms of information state updates, letting the same construction play out in different ways depending on context and content.

Personae under uncertainty: The case of topoi
Bill Noble | Ellen Breitholtz | Robin Cooper

In this paper, we propose a probabilistic model of social signalling which adopts a persona-based account of social meaning. We use this model to develop a socio-semantic theory of conventionalised reasoning patterns, known as topoi. On this account the social meaning of a topos, as conveyed in an argument, is based on the set of ideologically related topoi it indicates in context. We draw a connection between the role of personae in social meaning and the category adjustment effect, a well-known psychological phenomenon in which the representation of a stimulus is biased in the direction of the category in which it falls. Finally, we situate the interpretation of social signals as an update to the information state of an agent in a formal TTR model of dialogue.

Dogwhistles as Identity-based interpretative variation
Quentin Dénigot | Heather Burnett

The following paper presents a formal model for the description of dogwhistles. Dogwhistles are a class of terms or expressions, common in political discourse, that are used with the goal of being interpreted in different ways by different communities. The model presented here describes this phenomenon using a variation on the Social Meaning Games framework that uses probability distributions over possible interpretation functions as well as RSA/IBR reasoning.

Conditional answers and the role of probabilistic epistemic representations
Jos Tellings

Conditional utterances can be used in discourse as answers to regular, non-conditional questions in situations of partial knowledge of the answerer. We claim that the probabilities assigned to possible epistemic states of A are a measure of the utility of conditional answers. A second criterion that makes a conditional answer ‘if p, then q’ relevant has to do with the dependency between p and q that is conveyed in the statement. A conditional answer counts as relevant when this dependency leads the question asker to shift from a decision problem about q to an alternative, easier, decision problem about p.

Linguistic interpretation as inference under argument system uncertainty: the case of epistemic must
Brandon Waldon

Modern semantic analyses of epistemic language (incl. the modals must and might) can be characterized by the following ‘credence assumption’: speakers have full certainty regarding the propositions that structure their epistemic state. Intuitively, however: a) speakers have graded, rather than categorical, commitment to these propositions, which are often never fully and explicitly articulated; b) listeners have higher-order uncertainty about this speaker uncertainty; c) must p is used to communicate speaker commitment to some conclusion p and to indicate speaker commitment to the premises that condition the conclusion. I explore the consequences of relaxing the credence assumption by extending the argument system semantic framework first proposed by Stone (1994) to a Bayesian probabilistic framework of modeling pragmatic interpretation (Goodman and Frank, 2016). The analysis makes desirable predictions regarding the behavior and interpretation of must, and it suggests a new way of considering the nature of context and communicative exchange.

Linguists Who Use Probabilistic Models Love Them: Quantification in Functional Distributional Semantics
Guy Emerson

Functional Distributional Semantics provides a computationally tractable framework for learning truth-conditional semantics from a corpus. Previous work in this framework has provided a probabilistic version of first-order logic, recasting quantification as Bayesian inference. In this paper, I show how the previous formulation gives trivial truth values when a precise quantifier is used with vague predicates. I propose an improved account, avoiding this problem by treating a vague predicate as a distribution over precise predicates. I connect this account to recent work in the Rational Speech Acts framework on modelling generic quantification, and I extend this to modelling donkey sentences. Finally, I explain how the generic quantifier can be both pragmatically complex and yet computationally simpler than precise quantifiers.

Fast visual grounding in interaction: bringing few-shot learning with neural networks to an interactive robot
José Miguel Cano Santín | Simon Dobnik | Mehdi Ghanimifard

The major shortcomings of using neural networks with situated agents are that in incremental interaction very few learning examples are available and that their visual sensory representations are quite different from image caption datasets. In this work we adapt and evaluate a few-shot learning approach, Matching Networks (Vinyals et al., 2016), to conversational strategies of a robot interacting with a human tutor in order to efficiently learn to categorise objects that are presented to it and also investigate to what degree transfer learning from pre-trained models on images from different contexts can improve its performance. We discuss the implications of such learning on the nature of semantic representations the system has learned.

Discrete and Probabilistic Classifier-based Semantics
Staffan Larsson

We present a formal semantics (a version of Type Theory with Records) which places classifiers of perceptual information at the core of semantics. Using this framework, we present an account of the interpretation and classification of utterances referring to perceptually available situations (such as visual scenes). The account improves on previous work by clarifying the role of classifiers in a hybrid semantics combining statistical/neural classifiers with logical/inferential aspects of meaning. The account covers both discrete and probabilistic classification, thereby enabling learning, vagueness and other non-discrete linguistic phenomena.

Social Meaning in Repeated Interactions
Elin McCready | Robert Henderson

Judgements about communicative agents evolve over the course of interactions both in how individuals are judged for testimonial reliability and for (ideological) trustworthiness. This paper combines a theory of social meaning and persona with a theory of reliability within a game-theoretic view of communication, giving a formal model involving interactional histories, repeated game models and ways of evaluating social meaning and trustworthiness.

Towards functional, agent-based models of dogwhistle communication
Robert Henderson | Elin McCready

Henderson and McCready 2017, 2018, 2019 build a novel theory of so-called ‘dogwhistle’ communication by extending the social meaning games of Burnett 2017. This work reports on an ongoing project to build systems to model the evolution of dogwhistle communication in a population based on probability monads (Erwig and Kollmansberger, 2006; Kidd, 2007). The ultimate results will be useful not just for dogwhistles, but for modeling the diffusion and evolution of social meaning in populations in general. The initial result presented here is a computational implementation of Henderson and McCready 2018, which will serve as the basis for models with multiple speakers and repeated interactions.

Stochastic Frames
Annika Schuster | Corina Stroessner | Peter Sutton | Henk Zeevat

In the frame hypothesis (CITATION), human concepts are equated with frames, which extend feature lists by a functional structure consisting of attributes and values. For example, a bachelor is represented by the attributes gender and marital status and their values ‘male’ and ‘unwed’. This paper makes the point that for many applications of concepts in cognition, including for concepts to be associated with lexemes in natural languages, the right structures to assume are not merely frames but stochastic frames in which attributes are associated with probability distributions over values. The paper introduces the idea of stochastic frames and suggests three applications: vagueness, ambiguity, and typicality.
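
As a rough illustration of the idea described in this abstract, the following Python sketch represents a stochastic frame as a mapping from attributes to probability distributions over values. The probabilities, the names `BACHELOR`, `check_frame` and `typicality`, and the choice to score typicality as a product over independent attributes are all assumptions made for this example; they are not taken from the paper.

```python
from math import prod

# Hypothetical stochastic frame for "bachelor": each attribute maps to a
# probability distribution over its values. The numbers are invented.
BACHELOR = {
    "gender": {"male": 0.99, "female": 0.01},
    "marital_status": {"unwed": 0.98, "wed": 0.02},
}


def check_frame(frame):
    """Raise ValueError unless every attribute's distribution sums to 1."""
    for attribute, dist in frame.items():
        total = sum(dist.values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"{attribute} sums to {total}, not 1")


def typicality(frame, instance):
    """Score an instance as the product of its attribute-value probabilities.

    Treating the attributes as independent is an assumption of this sketch.
    Values missing from a distribution get probability 0.
    """
    return prod(frame[attr].get(value, 0.0) for attr, value in instance.items())


check_frame(BACHELOR)
typical = typicality(BACHELOR, {"gender": "male", "marital_status": "unwed"})
atypical = typicality(BACHELOR, {"gender": "male", "marital_status": "wed"})
```

Under these invented numbers the unwed instance scores higher than the wed one, which is one simple way a graded notion of typicality could fall out of attribute-level distributions.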

A toy distributional model for fuzzy generalised quantifiers
Mehrnoosh Sadrzadeh | Gijs Wijnholds

Recent work in compositional distributional semantics showed how bialgebras model generalised quantifiers of natural language. That technique requires working with vector spaces over power sets of bases, and is therefore computationally costly. It is possible to overcome the computational hurdles by working with fuzzy generalised quantifiers. In this paper, we show that the compositional notion of semantics of natural language, guided by a grammar, extends from a binary to a many-valued setting, and we instantiate the fuzzy computations in it. We import vector representations of words and predicates, learnt from large-scale compositional distributional semantics, interpret them as fuzzy sets, and analyse their performance on a toy inference dataset.

Generating Lexical Representations of Frames using Lexical Substitution
Saba Anwar | Artem Shelmanov | Alexander Panchenko | Chris Biemann

Semantic frames are formal linguistic structures describing situations/actions/events, e.g. Commercial transfer of goods. Each frame provides a set of roles corresponding to the situation participants, e.g. Buyer and Goods, and lexical units (LUs) – words and phrases that can evoke this particular frame in texts, e.g. Sell. The scarcity of annotated resources hinders wider adoption of frame semantics across languages and domains. We investigate a simple yet effective method, lexical substitution with word representation models, to automatically expand a small set of frame-annotated sentences with new words for their respective roles and LUs. We evaluate the expansion quality using FrameNet. Contextualized models demonstrate overall superior performance compared to the non-contextualized ones on roles. However, the latter show comparable performance on the task of LU expansion.

Informativity in Image Captions vs. Referring Expressions
Elizabeth Coppock | Danielle Dionne | Nathanial Graham | Elias Ganem | Shijie Zhao | Shawn Lin | Wenxing Liu | Derry Wijaya

At the intersection between computer vision and natural language processing, there has been recent progress on two natural language generation tasks: Dense Image Captioning and Referring Expression Generation for objects in complex scenes. The former aims to provide a caption for a specified object in a complex scene for the benefit of an interlocutor who may not be able to see it. The latter aims to produce a referring expression that will serve to identify a given object in a scene that the interlocutor can see. The two tasks are designed for different assumptions about the common ground between the interlocutors, and serve very different purposes, although they both associate a linguistic description with an object in a complex scene. Despite these fundamental differences, the distinction between these two tasks is sometimes overlooked. Here, we undertake a side-by-side comparison between image captioning and reference game human datasets and show that they differ systematically with respect to informativity. We hope that an understanding of the systematic differences among these human datasets will ultimately allow them to be leveraged more effectively in the associated engineering tasks.

How does Punctuation Affect Neural Models in Natural Language Inference
Adam Ek | Jean-Philippe Bernardy | Stergios Chatzikyriakidis

Natural Language Inference models have reached almost human-level performance, but their generalisation capabilities have not yet been fully characterized. In particular, sensitivity to small changes in the data is a current area of investigation. In this paper, we focus on the effect of punctuation on such models. Our findings can be broadly summarized as follows: (1) irrelevant changes in punctuation are correctly ignored by the recent transformer models (BERT), while older RNN-based models were sensitive to them; (2) all models, both transformers and RNN-based models, are incapable of taking into account small relevant changes in the punctuation.

Building a Swedish Question-Answering Model
Hannes von Essen | Daniel Hesslow

High-quality datasets for question answering exist in a few languages, but far from all. Producing such datasets for new languages requires extensive manual labour. In this work we look at different methods for using existing datasets to train question-answering models in languages lacking such datasets. We show that machine translation followed by cross-lingual projection is a viable way to create a full question-answering dataset in a new language. We introduce new methods both for bitext alignment, using optimal transport, and for direct cross-lingual projection, utilizing multilingual BERT. We show that our methods produce good Swedish question-answering models without any manual work. Finally, we apply our proposed methods to Spanish and evaluate them on the XQuAD and MLQA benchmarks, where we achieve new state-of-the-art values of 80.4 F1 and 62.9 Exact Match (EM) points on the Spanish XQuAD corpus and 70.8 F1 and 53.0 EM on the Spanish MLQA corpus, showing that the technique is readily applicable to other languages.
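
For readers unfamiliar with the two metrics reported above, the Python sketch below shows how Exact Match and token-level F1 are typically computed for extractive question answering. It is a simplified illustration, not the benchmarks' official scoring code: the `normalise` helper only lowercases and splits on whitespace, whereas official evaluation scripts apply further normalisation (for example, stripping punctuation), and all function names here are hypothetical.

```python
from collections import Counter


def normalise(text):
    """Lowercase and split on whitespace (a simplification; see lead-in)."""
    return text.lower().split()


def exact_match(prediction, gold):
    """Return 1.0 if the normalised answers are identical, else 0.0."""
    return float(normalise(prediction) == normalise(gold))


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall against the gold answer."""
    pred, ref = normalise(prediction), normalise(gold)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented example: the prediction contains one extra token.
em = exact_match("i Stockholm", "Stockholm")
f1 = token_f1("i Stockholm", "Stockholm")
```

In this invented example the prediction is not an exact match, but it still earns partial F1 credit for the shared token, which is why F1 figures are generally higher than EM figures.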

Word Sense Distance in Human Similarity Judgements and Contextualised Word Embeddings
Janosch Haber | Massimo Poesio

Homonymy is often used to showcase one of the advantages of context-sensitive word embedding techniques such as ELMo and BERT. In this paper we want to shift the focus to the related but less exhaustively explored phenomenon of polysemy, where a word expresses various distinct but related senses in different contexts. Specifically, we aim to i) investigate a recent model of polyseme sense clustering proposed by Ortega-Andres & Vicente (2019) through analysing empirical evidence of word sense grouping in human similarity judgements, ii) extend the evaluation of context-sensitive word embedding systems by examining whether they encode differences in word sense similarity and iii) compare the word sense similarities of both methods to assess their correlation and gain some intuition as to how well contextualised word embeddings could be used as surrogate word sense similarity judgements in linguistic experiments.

Short-term Semantic Shifts and their Relation to Frequency Change
Anna Marakasova | Julia Neidhardt

We present ongoing research on the relationship between short-term semantic shifts and frequency change patterns by examining the case of the refugee crisis in Austria from 2015 to 2016. Our experiments are carried out on a diachronic corpus of Austrian German, namely a corpus of newspaper articles. We trace the evolution of the usage of words that represent concepts in the context of the refugee crisis by analyzing cosine similarities of word vectors over time as well as similarities based on the words’ nearest neighbourhood sets. In order to investigate how exactly the contextual meanings have changed, we measure cosine similarity between the following pairs of words: words describing the refugee crisis, on the one hand, and words indicating the process of mediatization and politicization of the refugee crisis in Austria proposed by a domain expert, on the other hand. We evaluate our approach against the expert knowledge. The paper presents the current findings and outlines directions for future work.
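
The two kinds of measurement mentioned in this abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the two-dimensional vectors, the English placeholder words, the time-slice names and the use of Jaccard overlap to compare neighbourhood sets are all invented for the example and are not taken from the paper or its corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(word, vectors, k):
    """Return the set of the k words most cosine-similar to `word`."""
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return {other for _, other in scored[:k]}


def jaccard(a, b):
    """Overlap of two sets: size of intersection over size of union."""
    return len(a & b) / len(a | b) if (a | b) else 1.0


# Invented toy embeddings for two time slices (not real corpus data).
slice_2015 = {"refugee": [1.0, 0.0], "help": [0.9, 0.1],
              "border": [0.0, 1.0], "crisis": [0.1, 0.9]}
slice_2016 = {"refugee": [0.0, 1.0], "help": [0.9, 0.1],
              "border": [0.0, 1.0], "crisis": [0.1, 0.9]}

# Similarity of one word pair, measured separately within each slice.
sim_2015 = cosine(slice_2015["refugee"], slice_2015["crisis"])
sim_2016 = cosine(slice_2016["refugee"], slice_2016["crisis"])

# Overlap of the word's nearest-neighbour sets across the two slices.
overlap = jaccard(nearest_neighbours("refugee", slice_2015, 2),
                  nearest_neighbours("refugee", slice_2016, 2))
```

Comparing a word pair within each slice, and comparing neighbour sets rather than raw vectors across slices, avoids having to assume that the separately trained vector spaces are aligned with one another.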