Ella Schad
2026
Investigating Reasoning with Hypotheses: The RIP2 Corpus
Ella Schad | Clara Seyfried | Chris Reed
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Analyses of hypothesis generation in fictionalised environments have significant potential for exploring factors influencing reasoning and decision-making in naturalistic contexts. Based on transcripts of 16 groups, comprising 42 human participants in total, playing a murder mystery game, RIP2 is a 177,000-word corpus exemplifying reasoning in the forensic domain. With an 80,000-word representative sample of the corpus annotated using an argumentation framework, RIP2 is nearly twice the size of the RIP Corpus of Collaborative Hypothesis-Making (RIP1), currently the only existing corpus of hypothesis-making in group environments. With a new experimental set-up and guidelines for annotating both hypothesising and conjecturing, RIP2 offers insight into how participants generate, maintain, and reject hypotheses, as well as how they interact with others’ contributions. Based on its close exploration of six groups (three successful), this corpus particularly allows for group-level comparisons of factors influencing group success. Within this paper, we discuss the main contributions for understanding hypothesising and collaborative reasoning, and offer use cases demonstrating how analysis of hypothesis generation can inform future research on argumentation quality and decision-making.
MAD: A Corpus of Multilingual Argumentative Deliberation
Eimear Maguire | Ella Schad | Jacky Visser | Chris Reed | John Lawrence
Proceedings of the Fifteenth Language Resources and Evaluation Conference
We present a corpus of Multilingual Argumentative Deliberation (MAD), a manually annotated corpus of deliberative dialogues in English, German, Polish and Italian. Four groups each completed two variants of a ranking task, the NASA Survival Scenario; once in their native language and once in English. The corpus is annotated using Inference Anchoring Theory (IAT), a framework developed for analysing argument in dialogical settings, and widely used in argument mining. As an argument mining resource, MAD is distinct in offering equivalent instances of spontaneous argumentation across languages. In addition to use in argument mining, the annotation captures both argument relations and dialogue acts, enabling deeper analysis of argument and dialogue structure than typical of argument-only corpora. The design of the corpus enables studies of second-language effects in English-medium interaction, cross-linguistic argument comparisons for German, Polish and Italian, and speaker dialogue strategy consistency, amongst others. The primary annotated MAD corpus is freely available at https://corpora.aifdb.org/mad, while we additionally release the unannotated transcripts to facilitate repurposing of the material.
2024
Overview of DialAM-2024: Argument Mining in Natural Language Dialogues
Ramon Ruiz-Dolz | John Lawrence | Ella Schad | Chris Reed
Proceedings of the 11th Workshop on Argument Mining (ArgMining 2024)
Argumentation is the process by which humans rationally elaborate their thoughts and opinions in written (e.g., essays) or spoken (e.g., debates) contexts. Argument Mining research, however, has focused on either written or spoken argumentation without considering additional information such as speech acts and intentions. In this paper, we present an overview of DialAM-2024, the first shared task in dialogical argument mining, where argumentative relations and speech illocutions are modelled together in a unified framework. The task was divided into two sub-tasks: the identification of propositional relations and the identification of illocutionary relations. Six teams explored different methodologies to leverage both sources of information to reconstruct argument maps containing the locutions uttered in the speeches and the argumentative propositions implicit in them. The best performing team achieved an F1-score of 67.05% in the overall evaluation of the reconstruction of complete argument maps, considering both sub-tasks included in the DialAM-2024 shared task.
The RIP Corpus of Collaborative Hypothesis-Making
Ella Schad | Jacky Visser | Chris Reed
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
The dearth of literature combining hypothesis-making and collaborative problem solving hinders investigation of how hypotheses are generated in group environments. A new dataset, the Resolving Investigative hyPotheses (RIP) corpus, is introduced to address this issue. The corpus uses the fictionalised environment of a murder investigation game. An artificial environment restricts the number of possible hypotheses compared to real-world situations, allowing a deeper dive into the data. In three groups of three, participants collaborated to solve the mystery: two groups came to the wrong conclusion in different ways, and one succeeded in solving the game. RIP is a 49k-word dialogical corpus, consisting of three sub-corpora, annotated for argumentation and discourse structure on the basis of Inference Anchoring Theory. The corpus shows the emergent roles individuals took on and the strategies the groups employed, demonstrating what can be gained through a deeper exploration of this domain. The corpus bridges the gap between these two areas – hypothesis generation and collaborative problem solving – by using an environment rich with potential for hypothesising within a highly collaborative space.
2022
Disagreement Space in Argument Analysis
Annette Hautli-Janisz | Ella Schad | Chris Reed
Proceedings of the 1st Workshop on Perspectivist Approaches to NLP @LREC2022
For a highly subjective task such as recognising speaker intention and argumentation, the traditional way of generating gold standards is to aggregate a number of labels into a single one. However, this seriously neglects the underlying richness that characterises discourse and argumentation and is also, in some cases, straightforwardly impossible. In this paper, we present QT30nonaggr, the first corpus of non-aggregated argument annotation, which will be openly available upon publication. QT30nonaggr encompasses 10% of QT30, the largest corpus of dialogical argumentation and analysed broadcast political debate currently available, comprising 30 episodes of BBC’s ‘Question Time’ from 2020 and 2021. Based on a systematic and detailed investigation of annotation judgements across all steps of the annotation process, we structure the disagreement space with a taxonomy of the types of label disagreements in argument annotation, identifying the categories of annotation errors, fuzziness and ambiguity.