René Knaebel

Also published as: Rene Knaebel


2023

Discourse Sense Flows: Modelling the Rhetorical Style of Documents across Various Domains
Rene Knaebel | Manfred Stede
Findings of the Association for Computational Linguistics: EMNLP 2023

Recent research on shallow discourse parsing has given renewed attention to the role of discourse relation signals, in particular explicit connectives and so-called alternative lexicalizations. In our work, we first develop new models for extracting signals and classifying their senses, both for explicit connectives and alternative lexicalizations, based on the Penn Discourse Treebank v3 corpus. Thereafter, we apply these models to various raw corpora, and we introduce ‘discourse sense flows’, a new way of modeling the rhetorical style of a document by the linear order of coherence relations, as captured by the PDTB senses. The corpora span several genres and domains, and we undertake comparative analyses of the sense flows, as well as experiments on automatic genre/domain discrimination using discourse sense flow patterns as features. We find that n-gram patterns are indeed stronger predictors than simple sense (unigram) distributions.
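As a rough illustration of the n-gram idea in this abstract, the sketch below treats a document's sense flow as an ordered list of PDTB sense labels and counts unigram and bigram patterns that could serve as classifier features. The function names, the example flow, and the choice of bigrams are assumptions made for illustration; they are not taken from the paper's implementation.

```python
from collections import Counter
from typing import List, Tuple


def sense_ngrams(flow: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-grams of sense labels in document order."""
    return [tuple(flow[i:i + n]) for i in range(len(flow) - n + 1)]


def ngram_features(flow: List[str], max_n: int = 2) -> Counter:
    """Count every n-gram of length 1..max_n in a sense flow (hypothetical helper)."""
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        counts.update(sense_ngrams(flow, n))
    return counts


# Hypothetical sense flow for one document, in order of occurrence.
flow = [
    "Contingency.Cause",
    "Comparison.Contrast",
    "Contingency.Cause",
    "Comparison.Contrast",
]
features = ngram_features(flow, max_n=2)
```

Unigram counts alone cannot distinguish this flow from one with the same senses in a different order, whereas the bigram counts record which sense tends to follow which.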

A Weakly-Supervised Learning Approach to the Identification of “Alternative Lexicalizations” in Shallow Discourse Parsing
René Knaebel
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)

Recently, the identification of free connective phrases as signals for discourse relations has received new attention with the introduction of statistical models for their automatic extraction. The limited amount of annotations makes it still challenging to develop well-performing models. In our work, we want to overcome this limitation with semi-supervised learning from unlabeled news texts. We implement a self-supervised sequence labeling approach and filter its predictions by a second model trained to disambiguate signal candidates. With our novel model design, we report state-of-the-art results and in addition, achieve an average improvement of about 5% for both exactly and partially matched alternatively lexicalized discourse signals due to weak supervision.

Towards Fine-Grained Argumentation Strategy Analysis in Persuasive Essays
Robin Schaefer | René Knaebel | Manfred Stede
Proceedings of the 10th Workshop on Argument Mining

We define an argumentation strategy as the set of rhetorical and stylistic means that authors employ to produce an effective, and often persuasive, text. First computational accounts of such strategies have been relatively coarse-grained, while in our work we aim to move to a more detailed analysis. We extend the annotations of the Argument Annotated Essays corpus (Stab and Gurevych, 2017) with specific types of claims and premises, propose a model for their automatic identification and show first results, and then we discuss usage patterns that emerge with respect to the essay structure, the “flows” of argument component types, the claim-premise constellations, the role of the essay prompt type, and that of the individual author.

2022

On Selecting Training Corpora for Cross-Domain Claim Detection
Robin Schaefer | René Knaebel | Manfred Stede
Proceedings of the 9th Workshop on Argument Mining

Identifying claims in text is a crucial first step in argument mining. In this paper, we investigate factors for the composition of training corpora to improve cross-domain claim detection. To this end, we use four recent argumentation corpora annotated with claims and submit them to several experimental scenarios. Our results indicate that the “ideal” composition of training corpora is characterized by a large corpus size, homogeneous claim proportions, and less formal text domains.

Towards Identifying Alternative-Lexicalization Signals of Discourse Relations
René Knaebel | Manfred Stede
Proceedings of the 29th International Conference on Computational Linguistics

The task of shallow discourse parsing in the Penn Discourse Treebank (PDTB) framework has traditionally been restricted to identifying those relations that are signaled by a discourse connective (“explicit”) and those that have no signal at all (“implicit”). The third type, the more flexible group of “AltLex” realizations, has been neglected because of its small number of occurrences in the PDTB2 corpus. Their number has grown significantly in the recent PDTB3, and in this paper, we present the first approaches for recognizing these “alternative lexicalizations”. We compare the performance of a pattern-based approach and a sequence labeling model, add an experiment on the pre-classification of candidate sentences, and provide an initial qualitative analysis of the error cases made by both models.

2021

discopy: A Neural System for Shallow Discourse Parsing
René Knaebel
Proceedings of the 2nd Workshop on Computational Approaches to Discourse

This paper demonstrates discopy, a novel framework that makes it easy to design components for end-to-end shallow discourse parsing. For the purpose of demonstration, we implement recent neural approaches and integrate contextualized word embeddings to predict explicit and non-explicit discourse relations. Our proposed neural feature-free system performs competitively to systems presented at the latest Shared Task on Shallow Discourse Parsing. Finally, a web front end is shown that simplifies the inspection of annotated documents. The source code, documentation, and pretrained models are publicly accessible.

2020

Semi-Supervised Tri-Training for Explicit Discourse Argument Expansion
René Knaebel | Manfred Stede
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper describes a novel application of semi-supervision for shallow discourse parsing. We use a neural approach for sequence tagging and focus on the extraction of explicit discourse arguments. First, additional unlabeled data is prepared for semi-supervised learning. From this data, weak annotations are generated in a first setting and later used in another setting to study performance differences. In our studies, we show an increase in the performance of our models that ranges between 2 and 10% F1 score. Further, we give some insights into the generated discourse annotations and compare the developed additional relations with the training relations. We release this new dataset of explicit discourse arguments to enable the training of large statistical models.

Contextualized Embeddings for Connective Disambiguation in Shallow Discourse Parsing
René Knaebel | Manfred Stede
Proceedings of the First Workshop on Computational Approaches to Discourse

This paper studies a novel model that simplifies the disambiguation of connectives for explicit discourse relations. We use a neural approach that integrates contextualized word embeddings and predicts whether a connective candidate is part of a discourse relation or not. We study the influence of those context-specific embeddings. Further, we show the benefit of training the tasks of connective disambiguation and sense classification together at the same time. The success of our approach is supported by state-of-the-art results.

2019

Window-Based Neural Tagging for Shallow Discourse Argument Labeling
René Knaebel | Manfred Stede | Sebastian Stober
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

This paper describes a novel approach for the task of end-to-end argument labeling in shallow discourse parsing. Our method describes a decomposition of the overall labeling task into subtasks and a general distance-based aggregation procedure. For learning these subtasks, we train a recurrent neural network and gradually replace existing components of our baseline by our model. The model is trained and evaluated on the Penn Discourse Treebank 2 corpus. While it is not as good as knowledge-intense approaches, it clearly outperforms other models that are also trained without additional linguistic features.