Dialogue & Discourse (2011)


Volume 2

In this study, maternal input was analyzed during a task in which German mothers instructed their two-year-old children to put two objects together in a particular way. The setting varied the spatial relation (ON and UNDER) and the canonicality of these relations (canonical, such as ‘a pot on a table’, and noncanonical, such as ‘a train on a tunnel’). Two kinds of discourse strategies are proposed that characterize mothers’ input in this task: bring-in and follow-in. For the analysis, an automatic procedure was developed in which the number of words spent on a strategy was related to the overall word count. The data suggest that the canonicality of the task can change the discourse: bring-in strategies dominated the discourse in tasks with canonical spatial relations, while in the more difficult tasks with non-canonical relations, German-speaking mothers used follow-ins significantly more often than in the canonical tasks. Together, the results of this study shed light on the process of on-line adaptation of the mother to her child and give us insight into how situated understanding emerges in task-oriented discourse.
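The abstract's automatic procedure, relating the words devoted to each strategy to the overall word count, can be sketched roughly as follows. This is a minimal illustration only: the function name, the strategy tags, and the toy transcript are assumptions, not the authors' actual implementation.

```python
from collections import Counter

def strategy_word_proportions(tagged_utterances):
    """For each discourse strategy, relate the number of words spent
    on it to the overall word count (illustrative sketch)."""
    counts = Counter()
    for strategy, utterance in tagged_utterances:
        counts[strategy] += len(utterance.split())
    total = sum(counts.values())
    return {s: n / total for s, n in counts.items()}

# Hypothetical tagged transcript fragment.
transcript = [
    ("bring-in", "put the pot on the table"),
    ("follow-in", "yes on the table"),
    ("bring-in", "now the train"),
]
props = strategy_word_proportions(transcript)
```

Here `props` maps each strategy to its share of all maternal words, so the shares sum to one and can be compared across canonical and non-canonical tasks.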
From two corpus studies into varieties of clausal coordination in English (Meyer, 1995; Greenbaum & Nelson, 1999), it is known that the incidence of clausal coordinate ellipsis (CCE) is about twice as high in written as in spoken language. We present a treebank study into CCE in written and spoken Dutch and German which confirms this tendency. Moreover, we observe considerable differences between written and spoken language with respect to the incidence of four main types of clausal coordinate ellipsis: Gapping, Forward Conjunction Reduction (FCR), Backward Conjunction Reduction (BCR), and Subject Gap with Finite/Fronted Verb (SGF). We argue that this detailed data pattern cannot be accounted for in terms of audience design, and propose an explanation based on the assumption that during spontaneous speaking, but not during writing, the scope of online grammatical planning is basically restricted to one (finite) clause.
Spoken contributions in dialogue often continue or complete earlier contributions by either the same or a different speaker. These compound contributions (CCs) thus provide a natural context for investigations of incremental processing in dialogue. We present a corpus study which confirms that CCs are a key dialogue phenomenon: almost 20% of contributions fit our general definition of CCs, with nearly 3% being the cross-person case most often studied. The results suggest that processing is word-by-word incremental, as splits can occur within syntactic ‘constituents’; however, some systematic differences between same- and cross-person cases indicate important dialogue-specific pragmatic effects. An experimental study then investigates these effects by artificially introducing CCs into multi-party text dialogue. Results suggest that CCs affect people’s expectations about who will speak next and whether other participants have formed a coalition or ‘party’. Together, these studies suggest that CCs require an incremental processing mechanism that can provide a resource for constructing linguistic constituents that span multiple contributions and multiple participants. They also suggest the need to model higher-level dialogue units that have consequences for the organization of turn-taking and for the development of a shared context.
Three experiments (self-paced reading, eyetracking and an ERP study) show that in relative clauses, increasing the distance between the relativized noun and the relative-clause verb makes it more difficult to process the relative-clause verb (the so-called locality effect). This result is consistent with the predictions of several theories (Gibson, 2000; Lewis and Vasishth, 2005), and contradicts the recent claim (Levy, 2008) that in relative-clause structures increasing argument-verb distance makes processing easier at the verb. Levy’s expectation-based account predicts that the expectation for a verb becomes sharper as distance is increased and therefore processing becomes easier at the verb. We argue that, in addition to expectation effects (which are seen in the eyetracking study in first-pass regression probability), processing load also increases with increasing distance. This contradicts Levy’s claim that heightened expectation leads to lower processing cost. Dependency-resolution cost and expectation-based facilitation are jointly responsible for determining processing cost.
We describe an eye-tracking experiment that tested the effect of syntactic predictability on skipping rates during reading. We found that plural noun phrases were skipped more often than singular noun phrases in syntactic contexts that induced a high expectation for a plural. We interpret this effect as evidence that the plural noun phrase had been predicted ahead of time. The results indicate that the examination of skipping rates might be a useful tool for the investigation of syntactic prediction effects.
Notwithstanding conclusive psychological and corpus evidence that at least some aspects of anaphoric and referential interpretation take place incrementally, and the existence of some computational models of incremental reference resolution, many aspects of the linguistics of incremental reference interpretation are still not well understood. We propose a model of incremental reference interpretation based on Loebner’s theory of definiteness and on the theory of anaphoric accessibility via resource situations developed in Situation Semantics, and show how this model can account for a variety of psychological results about incremental reference interpretation.
Ever since dialogue modelling first developed relative to broadly Gricean assumptions about utterance interpretation (Clark, 1996), it has remained an open question whether the full complexity of higher-order intention computation is made use of in everyday conversation. In this paper we examine the phenomenon of split utterances, from the perspective of Dynamic Syntax, to further probe the necessity of full intention recognition/formation in communication: we do so by exploring the extent to which the interactive coordination of dialogue exchange can be seen as emergent from low-level mechanisms of language processing, without needing representation by interlocutors of each other’s mental states, or fully developed intentions as regards messages to be conveyed. We thus illustrate how many dialogue phenomena can be seen as direct consequences of the grammar architecture, as long as this is presented within an incremental, goal-directed/predictive model.
We propose a novel dual processing model of linguistic routinisation, specifically formulaic expressions (from relatively fixed idioms, all the way through to looser collocational phenomena). This model is formalised using the Dynamic Syntax (DS) formal account of language processing, whereby we make a specific extension to the core DS lexical architecture to capture the dynamics of linguistic routinisation. This extension is inspired by work within cognitive science more broadly. DS has a range of attractive modelling features, such as full incrementality, as well as recent accounts of using resources of the core grammar for modelling a range of dialogue phenomena, all of which we deploy in our account. This yields not only a fully incremental model of formulaic language, but one that straightforwardly extends to routinised dialogue phenomena. We consider this approach to be a proof of concept of how interdisciplinary work within cognitive science holds out the promise of meeting challenges faced by modellers of dialogue and discourse.
We present techniques for the incremental interpretation and prediction of utterance meaning in dialogue systems. These techniques open possibilities for systems to initiate responsive overlap behaviors during user speech, such as interrupting, acknowledging, or completing a user’s utterance while it is still in progress. In an implemented system, we show that relatively high accuracy can be achieved in the understanding of spontaneous utterances before they are completed. Further, we present a method for determining when a system has reached a point of maximal understanding of an ongoing user utterance, and show that this determination can be made with high precision. Finally, we discuss a prototype implementation that shows how systems can use these abilities to strategically initiate system completions of user utterances. More broadly, this framework facilitates the implementation of a range of overlap behaviors that are common in human dialogue, but have been largely absent in dialogue systems.
Incremental spoken dialogue systems, which process user input as it unfolds, pose additional engineering challenges compared to more standard non-incremental systems: their processing components must be able to accept partial, and possibly subsequently revised, input, and must produce output that is both as accurate as possible and delivered with as little delay as possible. In this article, we define metrics that measure how well a given processor meets these challenges, and we identify types of gold standards for evaluation. We exemplify these metrics in the evaluation of several incremental processors that we have developed. We also present generic means to optimise some of the measures, if certain trade-offs are accepted. We believe that this work will help enable principled comparison of components for incremental dialogue systems and portability of results.
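Metrics of this kind can be illustrated with a simple "edit overhead" computation over a processor's sequence of add/revoke edits. The formulation below is an assumption for illustration and may differ in detail from the metric definitions in the article itself.

```python
def edit_overhead(edits, final_output):
    """Fraction of edits that were unnecessary: an ideal incremental
    processor would emit exactly one 'add' edit per unit of its final
    output and never revoke anything (illustrative formulation)."""
    necessary = len(final_output)
    total = len(edits)
    return (total - necessary) / total if total else 0.0

# A hypothetical ASR component hypothesises "four", revokes it,
# then settles on "forty two".
edits = [("add", "four"), ("revoke", "four"),
         ("add", "forty"), ("add", "two")]
final = ["forty", "two"]
eo = edit_overhead(edits, final)  # 2 unnecessary edits out of 4 -> 0.5
```

A lower score indicates more stable incremental output; the trade-off mentioned in the abstract is that stability can often be bought at the cost of added delay, e.g. by holding back hypotheses until they are unlikely to be revised.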
We present a general model and conceptual framework for specifying architectures for incremental processing in dialogue systems, in particular with respect to the topology of the network of modules that make up the system, the way information flows through this network, how information increments are ‘packaged’, and how these increments are processed by the modules. This model enables the precise specification of incremental systems and hence facilitates detailed comparisons between systems, as well as giving guidance on designing new systems. In particular, the model can serve as a framework for specifying module communication in such systems, as we illustrate with some examples.
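The network-of-modules picture the abstract describes can be sketched in a few lines: modules consume information increments on an input side, process them, and pass results to connected modules. All names and structures below are illustrative assumptions, not the framework's actual specification, which is considerably richer.

```python
from dataclasses import dataclass

@dataclass
class IncrementalUnit:
    """A minimal 'packaged' information increment (illustrative)."""
    payload: str

class Module:
    """A processing module whose output side feeds the input side of
    connected modules; the connections define the network topology."""
    def __init__(self, process):
        self.process = process      # how this module transforms increments
        self.right_buffer = []      # increments produced so far
        self.listeners = []         # downstream modules

    def connect(self, other):
        self.listeners.append(other)

    def receive(self, iu):
        out = self.process(iu)
        self.right_buffer.append(out)
        for m in self.listeners:
            m.receive(out)

# A toy two-stage pipeline: a recogniser feeding an understander.
sink_outputs = []
asr = Module(lambda iu: IncrementalUnit(iu.payload.lower()))
nlu = Module(lambda iu: IncrementalUnit(iu.payload + "/parsed"))
sink = Module(lambda iu: (sink_outputs.append(iu.payload), iu)[1])
asr.connect(nlu)
nlu.connect(sink)
asr.receive(IncrementalUnit("HELLO"))
```

Specifying a system in such terms (which modules exist, how they are wired, and what an increment contains) is what enables the precise comparisons between systems that the abstract emphasises.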
Language use in conversation is fundamentally incremental, and is guided by the representations that interlocutors maintain of each other’s knowledge and beliefs. While there is a consensus that interlocutors represent the perspective of others, three candidate models, a Perspective-Adjustment model, an Anticipation-Integration model, and a Constraint-Based model, make conflicting predictions about the role of perspective information during on-line language processing. Here we review psycholinguistic evidence for incrementality in language processing, and the recent methodological advance that has fostered its investigation—the use of eye-tracking in the visual world paradigm. We present visual world studies of perspective-taking, and evaluate each model’s account of the data. We argue for a Constraint-Based view in which perspective is one of multiple probabilistic constraints that guide language processing decisions. Addressees combine knowledge of a speaker’s perspective with rich information from the discourse context to arrive at an interpretation of what was said. Understanding how these sources of information combine to influence interpretation requires careful consideration of how perspective representations were established, and how they are relevant to the communicative context.
A brief introduction to the topics discussed in the special issue, and to the individual papers.