Jonathan Ginzburg

2023

pdf abs
TTR at the SPA: Relating type-theoretical semantics to neural semantic pointers
Staffan Larsson | Robin Cooper | Jonathan Ginzburg | Andy Luecking
Proceedings of the 4th Natural Logic Meets Machine Learning Workshop

This paper considers how the kind of formal semantic objects used in TTR (a theory of types with records, Cooper 2013) might be related to the vector representations used in Eliasmith (2013). An advantage of doing this is that it would immediately give us a neural representation for TTR objects as Eliasmith relates vectors to neural activity in his semantic pointer architecture (SPA). This would be an alternative using convolution to the suggestions made by Cooper (2019) based on the phasing of neural activity. The project seems potentially hopeful since all complex TTR objects are constructed from labelled sets (essentially sets of ordered pairs consisting of labels and values) which might be seen as corresponding to the representation of structured objects which Eliasmith achieves using superposition and circular convolution.

pdf bib abs
Referential Transparency and Inquisitiveness
Jonathan Ginzburg | Andy Lücking
Proceedings of the 4th Workshop on Inquisitiveness Below and Beyond the Sentence Boundary

The paper extends a referentially transparent approach which has been successfully applied to the analysis of declarative quantified NPs to wh-phrases. This uses data from dialogical phenomena such as clarification interaction, anaphora, and incrementality as a guide to the design of wh-phrase meanings.

pdf abs
Unravelling Indirect Answers to Wh-Questions: Corpus Construction, Analysis, and Generation
Zulipiye Yusupujiang | Jonathan Ginzburg
Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Indirect answers, crucial in human communication, serve to maintain politeness, avoid conflicts, and align with social customs. Although there has been a substantial number of studies on recognizing and understanding indirect answers to polar questions (often known as yes/no questions), there is a dearth of such work regarding wh-questions. This study takes up the challenge by constructing what is, to our knowledge, the first corpus of indirect answers to wh-questions. We analyze and interpret indirect answers to different wh-questions based on our carefully compiled corpus. In addition, we conducted a pilot study on generating indirect answers to wh-questions by fine-tuning the pre-trained generative language model DialoGPT (Zhang et al., 2020). Our results suggest this is a task that GPT finds difficult.

2022

pdf abs
UgChDial: A Uyghur Chat-based Dialogue Corpus for Response Space Classification
Zulipiye Yusupujiang | Jonathan Ginzburg
Proceedings of the Thirteenth Language Resources and Evaluation Conference

In this paper, we introduce a carefully designed and collected language resource: UgChDial – a Uyghur dialogue corpus based on a chatroom environment. The Uyghur Chat-based Dialogue Corpus (UgChDial) is divided into two parts: (1). Two-party dialogues and (2). Multi-party dialogues. We ran a series of 25, 120-minutes each, two-party chat sessions, totaling 7323 turns and 1581 question-response pairs. We created 16 different scenarios and topics to gather these two-party conversations. The multi-party conversations were compiled from chitchats in general channels as well as free chats in topic-oriented public channels, yielding 5588 unique turns and 838 question-response pairs. The initial purpose of this corpus is to study query-response pairs in Uyghur, building on an existing fine-grained response space taxonomy for English. We provide here initial annotation results on the Uyghur response space classification task using UgChDial.

2021

pdf abs
Requesting clarifications with speech and gestures
Jonathan Ginzburg | Andy Luecking
Proceedings of the 1st Workshop on Multimodal Semantic Representations (MMSR)

In multimodal natural language interaction both speech and non-speech gestures are involved in the basic mechanism of grounding and repair. We discuss a couple of multimodal clarifica- tion requests and argue that gestures, as well as speech expressions, underlie comparable paral- lelism constraints. In order to make this precise, we slightly extend the formal dialogue frame- work KoS to cover also gestural counterparts of verbal locutionary propositions.

2020

pdf
Dialogue management with linear logic: the role of metavariables in questions and clarifications
Vladislav Maraev | Jean-Philippe Bernardy | Jonathan Ginzburg
Traitement Automatique des Langues, Volume 61, Numéro 3 : Dialogue et systèmes de dialogue [Dialogue and dialogue systems]

pdf abs
Designing a GWAP for Collecting Naturally Produced Dialogues for Low Resourced Languages
Zulipiye Yusupujiang | Jonathan Ginzburg
Workshop on Games and Natural Language Processing

In this paper we present a new method for collecting naturally generated dialogue data for a low resourced language, (specifically here—Uyghur). We plan to build a games with a purpose (GWAPs) to encourage native speakers to actively contribute dialogue data to our research project. Since we aim to characterize the response space of queries in Uyghur, we design various scenarios for conversations that yield to questions being posed and responded to. We will implement the GWAP with the RPG Maker MV Game Engine, and will integrate the chatroom system in the game with the Dialogue Experimental Toolkit (DiET). DiET will help us improve the data collection process, and most importantly, make us have some control over the interactions among the participants.

2019

pdf bib abs
Distribution is not enough: going Firther
Andy Lücking | Robin Cooper | Staffan Larsson | Jonathan Ginzburg
Proceedings of the Sixth Workshop on Natural Language and Computer Science

Much work in contemporary computational semantics follows the distributional hypothesis (DH), which is understood as an approach to semantics according to which the meaning of a word is a function of its distribution over contexts which is represented as vectors (word embeddings) within a multi-dimensional semantic space. In practice, use is identified with occurrence in text corpora, though there are some efforts to use corpora containing multi-modal information. In this paper we argue that the distributional hypothesis is intrinsically misguided as a self-supporting basis for semantics, as Firth was entirely aware. We mention philosophical arguments concerning the lack of normativity within DH data. Furthermore, we point out the shortcomings of DH as a model of learning, by discussing a variety of linguistic classes that cannot be learnt on a distributional basis, including indexicals, proper names, and wh-phrases. Instead of pursuing DH, we sketch an account of the problematic learning cases by integrating a rich, Firthian notion of dialogue context with interactive learning in signalling games backed by in probabilistic Type Theory with Records. We conclude that the success of the DH in computational semantics rests on a post hoc effect: DS presupposes a referential semantics on the basis of which utterances can be produced, comprehended and analysed in the first place.

pdf abs
Characterizing the Response Space of Questions: a Corpus Study for English and Polish
Jonathan Ginzburg | Zulipiye Yusupujiang | Chuyuan Li | Kexin Ren | Paweł Łupkowski
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

The main aim of this paper is to provide a characterization of the response space for questions using a taxonomy grounded in a dialogical formal semantics. As a starting point we take the typology for responses in the form of questions provided in (Lupkowski and Ginzburg, 2016). This work develops a wide coverage taxonomy for question/question sequences observable in corpora including the BNC, CHILDES, and BEE, as well as formal modelling of all the postulated classes. Our aim is to extend this work to cover all responses to questions. We present the extended typology of responses to questions based on a corpus studies of BNC, BEE and Maptask with include 506, 262, and 467 question/response pairs respectively. We compare the data for English with data from Polish using the Spokes corpus (205 question/response pairs). We discuss annotation reliability and disagreement analysis. We sketch how each class can be formalized using a dialogical semantics appropriate for dialogue management.

2016

pdf abs
DUEL: A Multi-lingual Multimodal Dialogue Corpus for Disfluency, Exclamations and Laughter
Julian Hough | Ye Tian | Laura de Ruiter | Simon Betz | Spyros Kousidis | David Schlangen | Jonathan Ginzburg
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present the DUEL corpus, consisting of 24 hours of natural, face-to-face, loosely task-directed dialogue in German, French and Mandarin Chinese. The corpus is uniquely positioned as a cross-linguistic, multimodal dialogue resource controlled for domain. DUEL includes audio, video and body tracking data and is transcribed and annotated for disfluency, laughter and exclamations.

pdf
When do we laugh?
Ye Tian | Chiara Mazzocconi | Jonathan Ginzburg
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue