Dialogue & Discourse (2013)


Dialogue & Discourse, Volume 4

We present an annotation effort that involves adding a new layer of annotation to an existing corpus. We are interested in how rhetorical relations are signalled in discourse, and thus begin with a corpus already annotated for rhetorical relations, to which we add signalling information. We show that a very large number of relations carry signals that identify them as such. The detailed, extensive analysis of signals in the corpus will aid research in the automatic parsing of discourse relations.
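The layered annotation described above, where signalling information is added on top of existing rhetorical-relation annotations, can be sketched as a simple data model. This is a minimal illustration with invented field names, not the corpus's actual format:

```python
# Minimal sketch of a relation annotation extended with a signalling
# layer (field names and signal types are assumptions for illustration).
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Signal:
    signal_type: str       # e.g. "discourse marker", "lexical", "syntactic"
    tokens: List[str]      # the surface tokens that carry the signal

@dataclass
class RhetoricalRelation:
    relation: str                      # e.g. "cause", "elaboration"
    arg1_span: Tuple[int, int]         # token offsets of the first argument
    arg2_span: Tuple[int, int]         # token offsets of the second argument
    signals: List[Signal] = field(default_factory=list)

# a relation from the base corpus, enriched with a signalling annotation
rel = RhetoricalRelation("cause", (0, 5), (6, 12))
rel.signals.append(Signal("discourse marker", ["because"]))
print(rel.relation, len(rel.signals))  # cause 1
```

Keeping signals as a separate list leaves the original relation layer untouched, which mirrors the stand-off character of the annotation effort.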
The article discusses several issues relevant to the annotation of written and spoken corpus data with information structure. We discuss ways to identify focus top-down (via questions under discussion) or bottom-up (starting from pitch accents). We introduce a two-dimensional labelling scheme for information status and propose a way to distinguish between contrastive and non-contrastive information. Moreover, we take a side in a current debate, claiming that focus is triggered by two sources: newness and elicited alternatives (contrast). This may lead to a high number of semantic-pragmatic foci in a single sentence. In each prosodic phrase there can be one primary focus (marked by a nuclear pitch accent) and several secondary foci (marked by weaker prosodic prominence). Second occurrence focus is one instance of secondary focus.
This paper briefly describes the Turkish Discourse Bank, the first publicly available annotated discourse resource for Turkish. It focuses on the challenges posed by annotating Turkish, a free word order language with rich inflectional and derivational morphology. It shows the usefulness of PDTB-style annotation but points out the need to adapt this annotation style to the needs of the target language.
In contrast to classical lexical semantic relations between verbs, such as antonymy, synonymy or hypernymy, presupposition is a lexically triggered semantic relation that is not well covered in existing lexical resources. It is also understudied in the field of corpus-based methods of learning semantic relations. Yet, presupposition is very important for semantic and discourse analysis tasks, given the implicit information that it conveys. In this paper we present a corpus-based method for acquiring presupposition-triggering verbs along with verbal relata that express their presupposed meaning. We approach this difficult task using a discriminative classification method that jointly determines and distinguishes a broader set of inferential semantic relations between verbs. The present paper focuses on important methodological aspects of our work: (i) a discriminative analysis of the semantic properties of the chosen set of relations, (ii) the selection of features for corpus-based classification and (iii) design decisions for the manual annotation of fine-grained semantic relations between verbs. (iv) We present the results of a practical annotation effort leading to a gold standard resource for our relation inventory, and (v) we report results for automatic classification of our target set of fine-grained semantic relations, including presupposition. We achieve a classification performance of 55% F1-score, a 100% improvement over a best-feature baseline.
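The discriminative classification of verb-pair relations described above can be illustrated with a toy model. The relation labels, features, and training pairs below are invented for illustration; the paper's actual features are corpus-derived and its classifier achieves 55% F1 over a much larger inventory:

```python
# Hedged sketch: a toy discriminative classifier over verb pairs
# (multiclass perceptron; all data and features are illustrative).
from collections import defaultdict

LABELS = ["presupposition", "entailment", "antonymy", "other"]

def featurize(v1, v2):
    # toy features: the verb pair itself and a shared-prefix cue
    return {f"pair={v1}|{v2}": 1.0,
            f"shared_prefix={v1[:3] == v2[:3]}": 1.0}

class Perceptron:
    def __init__(self, labels):
        self.labels = labels
        self.w = {l: defaultdict(float) for l in labels}

    def score(self, feats, label):
        return sum(self.w[label][f] * v for f, v in feats.items())

    def predict(self, feats):
        return max(self.labels, key=lambda l: self.score(feats, l))

    def train(self, data, epochs=10):
        for _ in range(epochs):
            for feats, gold in data:
                pred = self.predict(feats)
                if pred != gold:          # mistake-driven weight update
                    for f, v in feats.items():
                        self.w[gold][f] += v
                        self.w[pred][f] -= v

train = [(featurize("win", "play"), "presupposition"),
         (featurize("snore", "sleep"), "presupposition"),
         (featurize("open", "close"), "antonymy"),
         (featurize("buy", "purchase"), "other")]
clf = Perceptron(LABELS)
clf.train(train)
print(clf.predict(featurize("win", "play")))  # presupposition
```

Jointly predicting the full relation set, as the paper does, lets the classifier exploit contrasts between presupposition and its neighbouring relations rather than treating each relation in isolation.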
Situated dialogic corpora are invaluable resources for understanding the complex relationship between language, perception, and action as they are based on naturalistic dialogue situations in which the interactants are given shared goals to be accomplished in the real world. In such situations, verbal interactions are intertwined with actions, and shared goals can only be achieved via dynamic negotiation processes based on common ground constructed from discourse history as well as the interactants’ knowledge about the status of actions. In this paper, we propose four major dimensions of collaborative tasks that affect the negotiation processes among interactants, and, hence, the structure of the dialogue. Based on a review of available dialogue corpora and annotation manuals, we show that existing annotation schemes so far do not adequately account for the complex dialogue processes in situated task-based scenarios. We illustrate the effects of specific features of a scenario using annotated samples of dialogue taken from the literature as well as our own corpora, and end with a brief discussion of the challenges ahead.
Discourse structure and discourse relations are an important ingredient in systems for the analysis of text that go beyond the boundary of single clauses. Discourse relations often indicate important additional information about the connection between two clauses, such as causality, and are widely believed to have an influence on aspects of reference resolution. In this article, we first present the general design choices to be made when creating an annotation scheme for discourse structure and discourse relations. In a second part, we present the scheme used in our annotation of selected articles from the TüBa-D/Z treebank of German (Telljohann et al., 2009). The scheme used in the annotation is theory-neutral, but informed by more detailed linguistic knowledge in the way of linguistic tests that can help disambiguate between several plausible relations.
This paper deals with the annotation of "aboutness topic" (also known as "sentence topic") in naturally occurring data. We report on two annotation experiments in which relatively poor inter-rater agreement was attained for the annotation of topics, although the coders were adhering to the same annotation instructions in each experiment. After presenting some theoretical background on the notion of topic in linguistics, we present the first experiment. Tokens that prove particularly difficult to assess in that experiment are identified, systematized, and discussed in some detail. In sum, the cases that were most likely to lead to non-matching annotations are those that either require a decision between "thetic" or "topic-comment", or involve an overlap between focus and topic. In order to try to increase inter-rater agreement, we modified the annotation guidelines, aiming to eliminate some of the confounds from the first experiment. We then trained other annotators to use the modified guidelines and set them an annotation task. Again, the degree of inter-rater agreement was slightly disappointing. We discuss what we believe to be the problem cases in this task and give some guidance for future modification of the guidelines. The findings raise a number of issues that may contribute to the discussion in theoretical linguistics, and they also may alert other researchers planning a similar enterprise to some pitfalls they may encounter.
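Inter-rater agreement of the kind measured in these experiments is standardly quantified with Cohen's kappa, which corrects raw agreement for chance. The sketch below uses invented topic labels, not the experiments' data:

```python
# Hedged sketch: Cohen's kappa for two annotators (toy labels).
from collections import Counter

def cohens_kappa(ann1, ann2):
    assert len(ann1) == len(ann2) and ann1
    n = len(ann1)
    # observed agreement: proportion of items both coders labelled the same
    observed = sum(a == b for a, b in zip(ann1, ann2)) / n
    # expected agreement: chance overlap given each coder's label distribution
    c1, c2 = Counter(ann1), Counter(ann2)
    labels = set(ann1) | set(ann2)
    expected = sum((c1[l] / n) * (c2[l] / n) for l in labels)
    return (observed - expected) / (1 - expected)

a = ["topic", "topic", "thetic", "topic", "thetic", "topic"]
b = ["topic", "thetic", "thetic", "topic", "topic", "topic"]
print(round(cohens_kappa(a, b), 3))  # 0.25
```

A kappa this low would count as poor agreement on most interpretation scales, which matches the paper's description of the thetic/topic-comment decision as a persistent source of disagreement.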
We introduce a corpus of science journalism articles, categorized in three levels of writing quality. The corpus fulfills a glaring need for realistic data on which applications concerned with predicting text quality can be developed and evaluated. In this article we describe how we identified, guided by the judgements of renowned writers, samples of extraordinarily well-written pieces and how these were expanded to a larger set of typical journalistic writing. We provide details about the corpus and the text quality evaluations it can support. Our intention is to further extend the corpus with annotations of phenomena that reveal quantifiable differences between levels of writing quality. Here we introduce two of the many types of sentence-level annotation that distinguish amazing from typical writing: text generality/specificity and communicative goal. We explore the feasibility of acquiring annotations automatically, and verify that such features are indeed predictive of writing quality. We find that the annotation of general/specific at the sentence level can be performed reasonably accurately fully automatically, while automatic annotation of communicative goal reveals salient characteristics of journalistic writing but does not align with the categories we wish to annotate in future work.
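Automatic general/specific annotation of the kind explored here could in its simplest form rely on surface cues. The heuristic below is an invented illustration, not the article's learned annotator:

```python
# Hedged sketch: a surface-cue heuristic for flagging "specific"
# sentences (cues and thresholds are assumptions for illustration).
import re

def is_specific(sentence):
    tokens = sentence.split()
    # crude specificity cues: digits, proper-noun-like capitalised tokens
    # after the first word, or unusually long sentences
    has_number = any(re.search(r"\d", t) for t in tokens)
    proper_like = sum(t[0].isupper() for t in tokens[1:])
    return has_number or proper_like >= 2 or len(tokens) > 25

print(is_specific("The telescope cost 2.5 billion dollars."))  # True
print(is_specific("Science writing varies in quality."))       # False
```

A trained classifier would replace these hand-set thresholds with weights learned from annotated sentences, but the feature types (numbers, names, length) are typical starting points for specificity prediction.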
The various meanings of discourse connectives like while and however are difficult to identify and annotate, even for trained human annotators. This problem is all the more important because connectives are salient textual markers of cohesion and need to be correctly interpreted for many NLP applications. In this paper, we suggest an alternative route to reach a reliable annotation of connectives, by making use of the information provided by their translation in large parallel corpora. This method thus replaces the difficult explicit reasoning involved in traditional sense annotation by an empirical clustering of the senses emerging from the translations. We argue that this method has the advantage of providing more reliable reference data than traditional sense annotation. In addition, its simplicity allows for the rapid construction of large annotated datasets.
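The translation-based clustering idea can be sketched as follows: represent each occurrence of a connective by the distribution of its translations, then group occurrences whose distributions are similar. The French translations of "while" below are toy data, and the greedy threshold clustering is an illustrative simplification of the paper's method:

```python
# Hedged sketch: clustering connective occurrences by their translation
# distributions in a parallel corpus (toy data; method simplified).
from collections import Counter
from math import sqrt

def cosine(c1, c2):
    keys = set(c1) | set(c2)
    dot = sum(c1[k] * c2[k] for k in keys)
    n1 = sqrt(sum(v * v for v in c1.values()))
    n2 = sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2)

# each occurrence of "while" represented by its observed French translations
occurrences = {
    "occ1": Counter({"pendant que": 3, "tandis que": 1}),  # temporal use
    "occ2": Counter({"pendant que": 2}),                   # temporal use
    "occ3": Counter({"alors que": 4, "tandis que": 2}),    # contrastive use
}

def greedy_cluster(occs, threshold=0.5):
    clusters = []
    for name, vec in occs.items():
        for cluster in clusters:
            # join the first cluster whose seed is similar enough
            if cosine(vec, cluster[0][1]) >= threshold:
                cluster.append((name, vec))
                break
        else:
            clusters.append([(name, vec)])
    return [[n for n, _ in c] for c in clusters]

print(greedy_cluster(occurrences))  # [['occ1', 'occ2'], ['occ3']]
```

The emerging clusters play the role of sense labels: occurrences translated alike are treated as sharing a sense, with no explicit sense inventory required.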
I present arguments in favor of the Uniformity Hypothesis: the hypothesis that discourse can extend syntax dependencies without conflicting with them. I consider arguments that Uniformity is violated in certain cases involving quotation, and I argue that the cases presented in the literature are in fact completely consistent with Uniformity. I report on an analysis of all examples in the Copenhagen Dependency Treebanks involving violations of Uniformity. I argue that they are in fact all consistent with Uniformity, and conclude that the CDT should be revised to reflect this.
This paper reviews annotation schemes used for labeling discourse coherence in well-formed and noisy (essay) data, and it describes a system that we have developed for automated holistic scoring of essay coherence. We review previous, related work on unsupervised computational approaches to evaluating discourse coherence and focus on a taxonomy of discourse coherence schemes classified by their different goals and types of data. We illustrate how a holistic approach can be successfully used to build systems for noisy essay data, across domains and populations. We discuss the model features related to human scoring guide criteria for essay scoring, and the importance of using model features relevant to these criteria for the purpose of generating meaningful scores and feedback for students and test-takers. To demonstrate the effectiveness of a holistic annotation scheme, we present results of system evaluations.
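A holistic scoring model of the kind described combines coherence-related features into a single score on an essay-scoring scale. The feature names and weights below are invented for illustration; the actual system learns them from human-scored essays:

```python
# Hedged sketch: a linear holistic coherence scorer (feature names,
# weights, and the 1-6 scale clamp are assumptions for illustration).
FEATURES = {
    "entity_overlap": 0.5,       # entity continuity between sentences
    "connective_density": 0.3,   # rate of discourse connectives
    "grammar_errors": -0.4,      # errors per sentence (noisy essay data)
}

def holistic_score(feature_values, base=3.0):
    raw = base + sum(FEATURES[f] * v for f, v in feature_values.items())
    # clamp to a conventional 1-6 essay-scoring scale
    return max(1.0, min(6.0, raw))

print(holistic_score({"entity_overlap": 2.0,
                      "connective_density": 1.0,
                      "grammar_errors": 0.5}))
```

Tying each feature to a criterion in the human scoring guide, as the paper advocates, is what makes such scores interpretable as feedback for students and test-takers rather than opaque numbers.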
We present the AAWD and AACD corpora, a collection of discussions drawn from Wikipedia talk pages and small group IRC discussions in English, Russian and Mandarin. Our datasets are annotated with labels capturing two kinds of social acts: alignment moves and authority claims. We describe these social acts, describe our annotation process, highlight challenges we encountered and strategies we employed during annotation, and present some analyses of the resulting data set that illustrate the utility of our corpus and identify interactions among social acts, and between participant status and social acts, in online discourse.
Purver and Ginzburg introduce the Reprise Content Hypothesis (RCH) and use it to argue for a non-generalized quantifier approach to certain quantifiers. In previous work we contrasted their approach with an approach which employs a more classical generalized quantifier analysis. In the present paper we synthesize the two approaches and suggest that this gives us the best account of the dialogue phenomena associated with RCH.